IMPROVEMENTS IN THE EXPRESSION OF NEWLY INTRODUCED
GENES IN YEAST CELLS

The present invention relates to improvements in the
expression of newly introduced genes in yeast cells.

In particular, the invention relates to a DNA sequence
capable of initiating transcription by yeast RNA polymerase II
which includes at least part of the regulon region of one of
the GAPDH genes of S. cerevisiae
(DNA = DeoxyriboNucleic Acid; RNA = RiboNucleic Acid;
GAPDH = GlycerAldehyde-3-Phosphate DeHydrogenase;
mRNA = messenger RNA).

The invention further relates to the use of a DNA
sequence capable of both terminating transcription by
yeast RNA polymerase II and effecting polyadenylation of
the mRNA, which includes at least part of the
termination/polyadenylation region belonging to one of the
GAPDH genes of S. cerevisiae.

The invention also relates to a larger rDNA sequence
which contains at least the above-indicated regulon
region of one of the GAPDH genes of S. cerevisiae and
one or more structural genes different from the GAPDH
genes of S. cerevisiae, which DNA sequence can be
inserted into a recombinant DNA plasmid or into a yeast
chromosome in order to transform yeasts so that they
become able to produce the desired proteins encoded by
the structural genes.

Finally, the invention relates to a process for
preparing a protein by cultivating a yeast containing the
above-mentioned larger rDNA sequences under conditions
whereby the protein is formed and isolating the protein
from that yeast culture, as well as to the proteins
produced by such a process.

BACKGROUND OF THE INVENTION

Developments in recombinant DNA (= rDNA) technology
have made it possible to isolate or synthesize specific
genes or portions thereof from higher organisms such as
man, animals and plants, and to transfer these genes or
gene fragments to microorganisms such as bacteria or
yeasts. The transferred gene is replicated and propagated
as the transformed microorganism replicates. As a
result, the transformed microorganism may become
endowed with the capacity to make whatever protein the
transferred gene or gene fragment encodes, whether it is
an enzyme, a hormone, an antigen, an antibody, or a
portion thereof. The microorganism passes on this
capability to its progeny, so that in effect the transfer
has resulted in a new microbial strain having the
described capability.

A basic fact underlying the application of this
technology for practical purposes is that the DNA of all
living organisms, from microbes to man, is chemically
similar, being composed of the same four nucleotides.
For example, the same nucleotide sequence which codes
for the amino acid sequence specifying preprochymosin
in stomach cells of newborn mammals will, when
transferred to a microorganism, be recognized as coding for
the same amino acid sequence.

The basic constituents of recombinant DNA technology
are:

i) the gene encoding the protein of interest;
ii) a vector (plasmid) into which the new gene has to
be inserted to guarantee stable replication and a
high level of expression of the gene;
iii) a suitable host microorganism into which the vector
carrying the new gene can be introduced.

Depending on the nature of the protein to be
synthesized, the industrial application of this protein
and the technically possible fermentation and
purification procedures, the plasmid vector and the host
organism have to be selected. In most cases the
selection of a host organism which is above suspicion with
regard to the production of toxic substances, i.e. a
microorganism mentioned in the GRAS (Generally
Recognized As Safe) list, will be highly important.
However, only very few of these GRAS microorganisms
meet all requirements imposed by the application of
either the recombinant DNA or the fermentation
technology. A selection on the basis of these positive
criteria shows that, at this moment, certain yeast
species, notably Saccharomyces, Kluyveromyces,
Debaryomyces, Hansenula, Candida, Torulopsis and Rhodotorula,
can be regarded as very promising host organisms for
genetically engineered DNA molecules.

In the present invention use is made of recombinant
DNA, molecular biological and chemical techniques to
construct plasmid vectors that can be stably maintained
within yeasts and, most importantly, contain the
appropriate regulons to bring about a high level of
expression of newly introduced genes.

Several plasmid vectors are known nowadays which can be
used for the transformation of the yeast Saccharomyces
cerevisiae [C.P. Hollenberg, Current Topics in Microb.
and Immunol. 96, 119-144 (1982) and A. Hinnen and B.
Meyhack, Current Topics in Microb. and Immunol. 96,
101-117 (1982)]. These vectors rely on either autonomous
replication sequences (ARS) isolated from the
chromosomal DNA of particular yeast species or the
replicating ability of the 2 micron DNA plasmid found
in Saccharomyces cerevisiae to maintain the vector and
the inserted gene within the host cell. Additionally,
these yeast vectors contain a marker by which
transformants can be selected. Examples of such markers are
the leu 2 gene [A. Hinnen et al., Proc. Natl. Acad. Sci.
USA, 75, 1929-1933 (1978)], the trp 1 gene [K. Struhl et
al., Proc. Natl. Acad. Sci. USA, 76, 1035-1039 (1979)], the
lactase gene [R.C. Dickson, Gene, 10, 347-356 (1980)],
and genes which confer on the host cell resistance
against certain antibiotics.

The stability of AR sequences in S. cerevisiae or (K)AR
sequences in Kluyveromyces lactis or K. fragilis is not
always sufficient for the development of a reliable
fermentation process using these yeasts. Therefore,
integration of the foreign structural gene into the
chromosome(s) of the new host cell can be very
important for the industrial application of rDNA-containing
yeasts.

Seen from an economic point of view, not only the
stability of the inserted gene within the yeast cell is
important but also the efficiency with which this gene
is expressed as protein. Based on today's knowledge, the
main routes to achieve high levels of expression of a
newly inserted gene include:

- integration of the structural gene downstream of a
  promoter (RNA initiation) site which can effect a
  high transcription frequency of the gene. Ideally,
  the promoter activity is inducible, i.e. can be
  switched on or off depending upon a temperature shift
  or the presence of an inducer in the growth medium.
  Potent promoters operating in yeasts are those
  responsible for transcription of the genes encoding
  glycolytic enzymes. Experimental work done by Maitra
  and Lobo [J. Biol. Chem., 246, 489-499 (1971)] suggests
  furthermore that some of these promoters are highly
  inducible. However, a serious difficulty with regard
  to the isolation of such promoters is that up to now
  little is known about the nucleotide sequences which
  confer regulation and full promoter activity on the
  DNA fragment.

- integration of an RNA (RiboNucleic Acid) termination
  signal downstream of the structural gene. Owing to
  the presence of such a termination signal,
  transcription of the structural gene cannot interfere
  with the transcription of adjacent operons. Moreover,
  transcription seems to be more efficient [K.S. Zaret
  and F. Sherman, Cell, 28, 563-573 (1982)] and
  polyadenylation of the messenger is likely to occur
  correctly, resulting in a more stable mRNA (messenger
  RNA) population. However, up to now exact data on
  nucleotide sequences required for termination of
  transcription in yeasts are not available.

- the presence of a nucleotide sequence flanking the
  AUG codon in the RNA molecule which is optimal for
  protein synthesis initiation. According to published
  data [M. Kozak, Nucleic Acids Res. 9, 5233-5252
  (1981)] the positions -3 and +4 in the sequence
  N X X A U G N are highly conserved (A or C at -3 and
  G at +4), an observation which suggests a role for
  these nucleotides in the recognition of the AUG codon
  as a translation start point by the ribosome.
  Although one might expect an efficient yeast promoter
  to contain either an A or C as nucleotide at the -3
  position, the nucleotide at the +4 position forms
  part of the coding sequence and is therefore
  dependent upon the nature of the gene to be inserted
  downstream of the promoter. This indicates that it
  will be difficult to fulfil this condition in all cases.
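
The -3/+4 rule described above can be checked mechanically. The following is a minimal Python sketch, not part of the original specification; the function name `start_codon_context` and the example sequences are hypothetical. It takes the A of the AUG codon as position +1 and applies the criteria exactly as stated in the text (A or C at -3, G at +4).

```python
def start_codon_context(mrna: str, aug_index: int) -> dict:
    """Report the nucleotides at -3 and +4 around an AUG codon.

    aug_index is the 0-based index of the A of the AUG (position +1).
    The acceptance criteria (A or C at -3, G at +4) follow the text above.
    """
    seq = mrna.upper().replace("T", "U")  # accept DNA or RNA notation
    if seq[aug_index:aug_index + 3] != "AUG":
        raise ValueError("no AUG codon at the given index")
    if aug_index < 3 or aug_index + 3 >= len(seq):
        raise ValueError("sequence too short to contain positions -3 and +4")
    minus3 = seq[aug_index - 3]   # position -3 (there is no position 0)
    plus4 = seq[aug_index + 3]    # position +4, first base after the AUG
    return {
        "minus3": minus3,
        "plus4": plus4,
        "minus3_ok": minus3 in "AC",
        "plus4_ok": plus4 == "G",
    }


if __name__ == "__main__":
    # Hypothetical leader sequences, chosen only to illustrate both outcomes.
    print(start_codon_context("GGAAAAAUGGCU", 6))
    print(start_codon_context("UUUUGUAUGCCA", 6))
```

Because position +4 belongs to the coding sequence, only the `minus3_ok` result can be influenced by the choice of promoter fragment, which is the difficulty the text points out.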

- the copy number of the vector within the host cell.
  In most cases high copy numbers will lead to higher
  mRNA levels and, therefore, to higher expression
  levels of a gene. In S. cerevisiae vectors containing
  the 2 micron DNA replication origin can reach copy
  numbers as high as 50. This is considerably more than
  can be reached by for instance integrating vectors or
  vectors containing autonomously replicating sequences
  (ARS) [K. Struhl et al., Proc. Natl. Acad. Sci. USA, 76,
  1035-1039 (1979) and A.J. Kingsman et al., Gene, 7,
  141-152 (1979)]. However, in yeast species other than
  Saccharomyces, 2 micron DNA has not yet been found,
  suggesting that its replication origin will be
  functional in a very limited number of yeast species
  only. Experimental data obtained so far show that the 2
  micron replication origin is functional in
  Schizosaccharomyces pombe [D. Beach and P. Nurse,
  Nature, 290, 140-142 (1981)], but not in Kluyveromyces
  lactis [G. Das and C.P. Hollenberg, Curr. Genet. 5,
  123-128 (1982)].

  Therefore the transformation of yeasts belonging to
  genera other than Saccharomyces or Schizosaccharomyces
  will be dependent in most cases upon the availability
  of other DNA replication origins such as ARS
  isolated from the organism to be transformed or upon
  the integration of the foreign gene into the yeast
  genome.

- a codon use of the gene which is optimal for the host
  organism used. Results obtained with the yeast S.
  cerevisiae show a strong correlation between the
  abundance of certain tRNA (transfer RNA) species and
  the occurrence of the respective codons in its
  protein genes. Therefore, optimal expression of for
  instance the bovine preprochymosin gene or the plant
  preprothaumatin gene in S. cerevisiae would require
  a chemical synthesis of both genes with a codon
  population which correlates with these abundant yeast
  tRNA species.

- an additional factor which might influence the
  translation of a gene is the presence of a DNA sequence
  encoding a so-called signal sequence. These signal
  sequences are in most cases hydrophobic N-terminal
  protein extensions which are often involved in the
  process of cotranslational secretion of the protein
  through a membrane. In the present specification new
  data are shown, obtained with the expression of the
  preprothaumatin gene in yeast, indicating that when
  the DNA sequence encoding the signal protein is
  removed from the gene, expression of the gene is
  reduced by a few orders of magnitude.

At present the number of known yeast species which can
serve as a host for recombinant DNA molecules is
limited to Saccharomyces cerevisiae and
Schizosaccharomyces pombe.

SUMMARY OF THE INVENTION 


Therefore, a need exists for additional yeast species
with other metabolic properties and other nutritional
demands, so that the field to which recombinant DNA
technology can be applied on a commercial level will be
broadened. However, the use of new yeast species
requires suitable DNA replication origins to guarantee
a stable replication of the vector molecule in the new
host as well as an appropriate expression system for
newly inserted DNA. The present invention provides an
expression system, the use of which is not restricted
to S. cerevisiae but which can function in other yeast
species as well. In order to achieve this, the RNA
initiation/regulation and the RNA termination/
polyadenylation signals of a glyceraldehyde-3-phosphate
dehydrogenase (GAPDH) gene of S. cerevisiae were
isolated.

Besides the fact that the RNA initiation site of a
GAPDH gene is a very efficient promoter in S.
cerevisiae [M.J. Holland et al., Biochemistry 12,
4900-4907 (1978)], the promoter according to the present
invention was isolated and applied after it was
realized that GAPDH is a metabolic key enzyme in many
different organisms, suggesting a certain conservatism
during evolution with regard to the nucleotide sequence
regulating its expression. On this basis further
experiments were carried out. It was observed (i) that
the isolated and radioactively labelled GAPDH regulon
fragment hybridized with colonies of various yeast
species, (ii) that the S. cerevisiae GAPDH regulon
expressed foreign genes in K. lactis efficiently and
(iii) that the larger regulon region of about 850
nucleotides was more effective than the smaller regulon
region of about 280 nucleotides published by J.P. Holland
and M.J. Holland in J. Biol. Chem. 255, 2596-2605 (1980).
These new findings gave us confidence that the regulon
isolated will prove to be useful in the expression of
foreign DNA in a number of other yeast species.

The present invention provides a DNA sequence capable
of initiating transcription by yeast RNA polymerase II
which includes at least part of the regulon region of
one of the GAPDH genes of S. cerevisiae, characterized
in that it comprises a DNA sequence essentially as
given in Fig. 2, and wherein the regulon region is
optionally modified to include at least one restriction
enzyme cleavage site, to facilitate manipulation of the
nucleotide sequence region for protein synthesis
initiation.

An example of a modification of the regulon is given in
item 4 and Figs. 7A and 8, where the introduction of a
Sac I site is described. Although in Fig. 2 only one
specific DNA sequence is described, it will be clear to
the expert that modifications of this DNA sequence,
either by replacement of one or more nucleotides or by
addition or deletion of one or more nucleotides, which
modifications do not impair the properties of the
regulon region given in Fig. 2, are within the realm
of the invention.

The invention further provides a DNA sequence capable
of both terminating transcription by yeast RNA
polymerase II and effecting polyadenylation of the
mRNA, which includes at least part of the termination/
polyadenylation region belonging to one of the GAPDH
genes of S. cerevisiae, characterised in that it
comprises a DNA sequence essentially as given in Fig. 3.

Indications have been obtained that the presence of
such a DNA sequence is favourable for the expression of
structural genes in yeasts. The presence of such a DNA
sequence seems to be particularly advantageous for
incorporating the rDNA in the genome of a yeast cell.

The invention also provides a DNA sequence, which can
be inserted into a recombinant DNA plasmid or into a
yeast chromosome, comprising:
(a) a DNA sequence essentially as given in Fig. 2, and
(b) one or more structural genes different from the
    GAPDH genes of S. cerevisiae, and at least two of
    features (c)-(f),
(c) one or more specific DNA sequences capable of
    terminating the transcription by yeast RNA polymerase
    II and effecting polyadenylation of the mRNA,
    and/or
(d) one or more selection markers, and/or
(e) either one or more nucleotide sequences allowing a
    stable insertion in a chromosome of yeasts or one
    or more DNA sequences which regulate DNA
    replication in yeasts belonging to the genus
    Saccharomyces or to the genus Kluyveromyces, and/or
(f) a DNA sequence encoding a signal polypeptide of not
    more than 30 amino acid residues assisting the
    translocation of proteins, which DNA sequence is
    situated upstream of and in the same reading frame
    as the structural gene.

In particular, the structural gene of (b) encodes
thaumatin or chymosin, or their various allelic or modified
forms, which modified forms do not impair the
sweet-tasting properties of thaumatin or the milk-clotting
properties of chymosin, respectively, or precursors of
these thaumatin-like or chymosin-like proteins, since
such DNA sequence will assist in giving an increased
expression of these genes.

It is preferred to use a DNA sequence essentially
as given in Fig. 3 as the specific DNA sequence of (c),
since this seems better adapted to yeast than other
termination/polyadenylation regions.

The DNA sequence of (f) encoding a signal polypeptide
can be selected from the group consisting of DNA
sequences encoding
(a) the signal polypeptide translocating S. cerevisiae
    invertase, namely Met.Leu.Leu.Gln.Ala.Phe.Leu.Phe.-
    Leu.Leu.Ala.Gly.Phe.Ala.Ala.Lys.Ile.Ser.Ala;
(b) the signal polypeptide translocating S. cerevisiae
    acid phosphatase, namely Met.Phe.Lys.Ser.Val.Val.-
    Tyr.Ser.Ile.Leu.Ala.Ala.Ser.Leu.Ala.Asn.Ala;
(c) the signal polypeptide of unmatured forms of
    thaumatin-like proteins, namely Met.Ala.Ala.Thr.-
    Thr.Cys.Phe.Phe.Phe.Leu.Phe.Pro.Phe.Leu.Leu.Leu.-
    Leu.Thr.Leu.Ser.Arg.Ala;
(d) the signal polypeptide of unmatured forms of
    chymosin-like proteins, namely Met.Arg.Cys.Leu.Val.-
    Val.Leu.Leu.Ala.Val.Phe.Ala.Leu.Ser.Gln.Gly; and
(e) two consensus signal polypeptides, namely
    Met.Ser.Lys.Ala.Ala.Leu.Ala.Phe.Ile.Ala.Phe.Val.-
    Ile.Val.Leu.Ile.Val.Asn.Ala and
    Met.Ser.Lys.Phe.Val.Ile.Val.Leu.Ile.Val.Ala.Ala.-
    Leu.Ala.Phe.Ile.Ala.Asn.Ala.
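
The length limit of feature (f) (not more than 30 amino acid residues) can be verified for the listed signal polypeptides with a short script. This is an illustrative Python sketch only; the dictionary keys and function name are hypothetical, and the sequences are transcribed from the list above on the assumption that the scanned "Gin" and "lie" stand for Gln and Ile.

```python
# Three-letter to one-letter amino acid codes (standard abbreviations).
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Signal polypeptides as listed in the text, in dotted three-letter notation.
SIGNAL_PEPTIDES = {
    "invertase": "Met.Leu.Leu.Gln.Ala.Phe.Leu.Phe.Leu.Leu.Ala.Gly.Phe.Ala.Ala.Lys.Ile.Ser.Ala",
    "acid_phosphatase": "Met.Phe.Lys.Ser.Val.Val.Tyr.Ser.Ile.Leu.Ala.Ala.Ser.Leu.Ala.Asn.Ala",
    "thaumatin": "Met.Ala.Ala.Thr.Thr.Cys.Phe.Phe.Phe.Leu.Phe.Pro.Phe.Leu.Leu.Leu.Leu.Thr.Leu.Ser.Arg.Ala",
    "chymosin": "Met.Arg.Cys.Leu.Val.Val.Leu.Leu.Ala.Val.Phe.Ala.Leu.Ser.Gln.Gly",
    "consensus_1": "Met.Ser.Lys.Ala.Ala.Leu.Ala.Phe.Ile.Ala.Phe.Val.Ile.Val.Leu.Ile.Val.Asn.Ala",
    "consensus_2": "Met.Ser.Lys.Phe.Val.Ile.Val.Leu.Ile.Val.Ala.Ala.Leu.Ala.Phe.Ile.Ala.Asn.Ala",
}

MAX_RESIDUES = 30  # upper limit stated for feature (f)


def to_one_letter(dotted: str) -> str:
    """Convert 'Met.Leu.Leu' style notation to a one-letter string ('MLL')."""
    return "".join(THREE_TO_ONE[residue] for residue in dotted.split("."))


if __name__ == "__main__":
    for name, dotted in SIGNAL_PEPTIDES.items():
        one_letter = to_one_letter(dotted)
        within_limit = len(one_letter) <= MAX_RESIDUES
        print(f"{name:17s} {len(one_letter):2d} residues  {one_letter}  ok={within_limit}")
```

A residue that is not one of the twenty standard abbreviations raises a `KeyError`, which makes transcription errors in a sequence easy to spot.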

In order to make the conditions for expression as good
as possible it is advisable to modify the structural
gene of (b) and/or the signal polypeptide-encoding DNA
sequence of (f) such that the codons are codons
preferred by yeasts. Following the teaching of J.P.
Holland and M.J. Holland (J. Biol. Chem. 255, 2596-2605
[1980]) the preferred codons are:

    GCC or GCT    alanine
    AGA           arginine
    AAC           asparagine
    GAC or GAT    aspartic acid
    TGT           cysteine
    CAA           glutamine
    GAA           glutamic acid
    GGT           glycine
    CAC           histidine
    ATC or ATT    isoleucine
    TTG           leucine
    AAG           lysine
    ATG           methionine
    TTC           phenylalanine
    CCA           proline
    TCC or TCT    serine
    ACC or ACT    threonine
    TGG           tryptophan
    TAC           tyrosine
    GTC or GTT    valine.
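
As an illustration of how the preferred-codon list could be applied, the following Python sketch back-translates a protein sequence into DNA using only the codons named above. It is a hedged example, not the procedure used in the specification: the function name `back_translate` is hypothetical, and where two codons are listed for an amino acid the sketch arbitrarily takes the first.

```python
# Preferred yeast codons as listed in the text, keyed by one-letter
# amino acid code (the one-letter keys are an addition for this sketch).
PREFERRED_CODONS = {
    "A": ("GCC", "GCT"), "R": ("AGA",), "N": ("AAC",), "D": ("GAC", "GAT"),
    "C": ("TGT",), "Q": ("CAA",), "E": ("GAA",), "G": ("GGT",),
    "H": ("CAC",), "I": ("ATC", "ATT"), "L": ("TTG",), "K": ("AAG",),
    "M": ("ATG",), "F": ("TTC",), "P": ("CCA",), "S": ("TCC", "TCT"),
    "T": ("ACC", "ACT"), "W": ("TGG",), "Y": ("TAC",), "V": ("GTC", "GTT"),
}


def back_translate(protein: str) -> str:
    """Return a DNA sequence encoding `protein` with preferred codons only.

    Assumption of this sketch: the first listed codon is used whenever
    the text gives two alternatives.
    """
    codons = []
    for residue in protein.upper():
        if residue not in PREFERRED_CODONS:
            raise ValueError(f"no preferred codon listed for {residue!r}")
        codons.append(PREFERRED_CODONS[residue][0])
    return "".join(codons)


if __name__ == "__main__":
    # First four residues of the invertase signal polypeptide (Met.Leu.Leu.Gln).
    print(back_translate("MLLQ"))
```

A real gene design would also have to consider restriction sites, repeats and the choice between alternative codons, none of which this sketch attempts.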



The DNA sequences described above are not an end in
themselves. They will be used in a process for
preparing yeasts containing these DNA sequences by
introducing these rDNA sequences into Saccharomyces,
Kluyveromyces, Debaryomyces, Hansenula, Candida,
Torulopsis or Rhodotorula yeasts, either in the form of
plasmids or by incorporation in the yeast genome.

The invention further provides yeasts containing such
rDNA sequences, either in the form of a plasmid or
incorporated in the yeast genome, and their use in a
process for preparing a protein by cultivation of such
yeast, whereby the rDNA sequence incorporated in the
yeast contains a structural gene encoding that protein
or a precursor thereof, which precursor can form the
relevant protein during processing.

Finally, the invention provides the proteins produced
by this novel process.

Several constructions in which the GAPDH promoter/
regulation region was combined with structural genes
encoding either preprothaumatin or preprochymosin or
some of the various maturation products have been made,
and synthesis of thaumatin-like proteins in yeast has
been demonstrated. Preferred constructions contained
structural genes encoding either preprothaumatin or
prethaumatin. To improve the expression yield of said
genes, the original codons can be replaced by codons
which are abundantly present in highly expressed genes.
Moreover, the DNA sequence encoding the signal sequence
of preprochymosin can be replaced by DNA sequences
encoding signal sequences of products excreted by yeasts,
such as the signal sequences of invertase and acid
phosphatase produced by S. cerevisiae.

The DNA sequences according to the invention may
comprise nucleotide sequences which regulate DNA
replication in yeasts belonging to the genera Saccharomyces
and Kluyveromyces. If the replication origin is
inserted into an rDNA plasmid, the following combinations
are preferred for Saccharomyces: a combination of the
replication origin of the 2 micron DNA with the leu 2
gene as is present on plasmid pMP81 [C.P. Hollenberg,
Current Topics in Microbiol. and Immunol. 96, 119-144
(1982)] and a combination of the replication origin of
the 2 micron DNA with the trp 1 gene
present on plasmid YRp7 [D.T. Stinchcomb et al., Nature
282, 39-43 (1979)]. A preferred replication origin for
Kluyveromyces consists of the KARS-2 sequence in
combination with the trp 1 gene as is present on plasmid
pEK 2-7 described in European Patent Application
No. 0096430 (A1) on pages 21 and 25 and in Fig. 2.

If the DNA sequence has to be inserted into a yeast
genome, it is preferable that the DNA sequence contains
the termination region of Fig. 3 downstream of the
structural gene besides the regulon region of Fig. 2
upstream of the foreign structural gene, which
combination will be inserted by homologous recombination at
the position of the GAPDH gene in the yeast genome (cf.
Fig. 19). An alternative may be that the combination
regulon region - structural gene - termination region
is inserted into a cloned DNA sequence derived from
the yeast genome (K. Struhl, Nature 305, 391-397 [1983] and
R.J. Rothstein, Methods in Enzymology 101, 202-211
[1983]).

For a better understanding of the invention the most
important terms used in the description will be
defined:

An operon is a gene comprising (a) a particular DNA
sequence [structural gene(s)] for polypeptide(s)
expression, (b) a control region or regulon (regulating
said expression) upstream of the structural gene and
mostly consisting of a promoter regulation sequence,
(c) a ribosome binding or interaction DNA sequence and
(d) a control region or transcription terminator
downstream of the structural gene.

Structural genes are DNA sequences which encode through
a template (mRNA) a sequence of amino acids
characteristic for a specific polypeptide.

A promoter is a DNA sequence within the regulon to
which RNA polymerase binds for the initiation of the
transcription.

A terminator is a DNA sequence within the operon
comprising amongst others particular DNA sequences
involved in the polyadenylation of mRNA and particular
DNA sequences involved in the termination of the
transcription of DNA by RNA polymerase.

Reading frame. The grouping of triplets of nucleotides
(codons) into such a frame that at mRNA level a proper
translation of the codons into the polypeptide takes
place.

Transcription. The process of producing mRNA from a
structural gene.

Translation. The process of producing a polypeptide
from mRNA.

Expression. The process undergone by a structural gene
to produce a polypeptide. It is a combination of many
processes, including at least transcription and
translation.

By signal sequence (also called signal polypeptide or
leader sequence) is meant that part of the pre(pro)protein
which has a high affinity to biomembranes or
special receptor proteins in biomembranes and/or which
is involved in the transport/translocation of pre(pro)protein.
These transport/translocation processes are
often accompanied by processing of the pre(pro)protein
into one of the mature forms of the protein.

Allelic form. One of the two or more naturally
occurring alternative forms of a gene product.

Chromosome. Thread-like structures into which the
hereditary material of cells is associated.

Genome. The total genetic information of cells
organised in the chromosomes.

Maturation form. One of two or more naturally
occurring forms of a gene product procured by specific
processing, e.g. specific proteolysis.

By maturation forms of preprothaumatin are meant
prothaumatin, prethaumatin (= preprothaumatin without
a carboxyl terminal sequence of 6 amino acids) and
thaumatin (EP-PA 54330 and EP-PA 54331).
By maturation forms of preprochymosin are meant
prochymosin, pseudo-chymosin and chymosin (EP-PA
77109).

Plus strand. DNA strand whose nucleotide sequence
is identical with the mRNA sequence, with the proviso
that uridine is replaced by thymidine.

The microbial cloning vehicles - containing (a) the
various forms of the regulons and terminators
(indicated with the suffix -01, -02, etc. in the plasmids
described in the specification) of the
glyceraldehyde-3-phosphate dehydrogenase (GAPDH) operon of S.
cerevisiae, (b) structural genes encoding
preprothaumatin and preprochymosin and their various
maturation forms, (c) various hybrid forms of said
structural genes encoding maturation forms of
preprothaumatin or preprochymosin with special signal
sequences and (d) various chemically synthesized DNA
sequences - are produced by a number of steps, the most
essential of which are:

 1. Isolation of clones containing the GAPDH operon of
    S. cerevisiae.
 2. Isolation of the GAPDH promoter/regulation region
    and its introduction into plasmids encoding
    thaumatin-precursors.
 3. Introduction of the GAPDH promoter/regulation
    region into plasmids encoding chymosin-precursors.
 4. Reconstitution of the original GAPDH promoter/
    regulation region in plasmids encoding
    preprothaumatin by introduction of a synthetic DNA
    fragment (Fig. 7A, Fig. 8).
 5. Reconstitution of the original GAPDH promoter/
    regulation region in plasmids encoding
    (pre)(pro)(pseudo)chymosin by introduction of a
    synthetic DNA fragment.
 6. DNA synthesis.
 7. Structural features of the GAPDH promoter/
    regulation region.
 8. Insertion of fragments of the GAPDH transcription
    termination/polyadenylation region in combination
    with the central transcription termination signal
    of phage M13RF downstream of genes encoding
    pseudochymosin.
 9. Introduction of the 2 micron DNA replication origin
    and the yeast leu 2 gene in plasmids encoding
    thaumatin-precursors and chymosin-precursors.
10. The introduction of GAPDH transcription termination/
    polyadenylation regions into pURY plasmids.
11. Construction of an E. coli-yeast shuttle vector
    widely applicable for gene expression in yeast.
12. Expression in K. lactis of the preprothaumatin
    encoding gene under control of the promoter/
    regulation region of the GAPDH encoding gene of
    S. cerevisiae.
13. Integration of structural genes under control of
    the GAPDH promoter/regulation region into the yeast
    chromosome.
14. Chemical synthesis of structural genes and
    construction of synthetically chimeric genes.

The following detailed description will illustrate the
invention.

1. Isolation of clones containing the
   glyceraldehyde-3-phosphate dehydrogenase (GAPDH) operon of
   S. cerevisiae.

A DNA pool of the yeast S. cerevisiae was prepared
in the hybrid E. coli-yeast plasmid pFl 1 [M.
Chevallier et al., Gene 11, 11-19 (1980)] by a method
similar to the one described by M. Carlson and D.
Botstein, Cell 28, 145-154 (1982). Purified yeast
DNA was partially digested with restriction
endonuclease Sau 3A and the resulting DNA fragments
(with an average length of 5 kb) were ligated by T4
DNA ligase in the dephosphorylated Bam HI site of
pFl 1. After transformation of CaCl2-treated E.
coli cells with the ligated material a pool of about
30,000 ampicillin-resistant clones was obtained.
These clones were screened by a colony hybridization
procedure [R.E. Thayer, Anal. Biochem., 98, 60-63
(1979)] with a chemically synthesized and 32P-
labelled oligomer with the sequence
5' TACCAGGAGACCAACTT 3'.
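
To put the library size in perspective, a rough estimate of how likely a pool of this size is to contain any given gene can be made with the standard relation P = 1 - (1 - f)^N, where f is the fraction of the genome carried by one insert and N the number of clones. The sketch below uses the figures from the text (about 30,000 clones, 5 kb average insert); the genome size of roughly 12,000 kb for S. cerevisiae is an assumption introduced here and is not given in the specification, and the function names are hypothetical.

```python
import math


def coverage_probability(n_clones: int, insert_kb: float, genome_kb: float) -> float:
    """Probability that a given sequence is present in at least one clone."""
    f = insert_kb / genome_kb          # fraction of the genome per insert
    return 1.0 - (1.0 - f) ** n_clones


def clones_needed(p: float, insert_kb: float, genome_kb: float) -> int:
    """Number of clones required to reach coverage probability p."""
    f = insert_kb / genome_kb
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - f))


if __name__ == "__main__":
    GENOME_KB = 12_000   # assumed haploid genome size, not from the text
    INSERT_KB = 5        # average insert length stated in the text
    N_CLONES = 30_000    # approximate pool size stated in the text

    print(coverage_probability(N_CLONES, INSERT_KB, GENOME_KB))
    print(clones_needed(0.99, INSERT_KB, GENOME_KB))
```

The estimate treats inserts as independent random samples of the genome and ignores cloning bias, so it should be read as an order-of-magnitude check only.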

According to data published by J.P. Holland and M.J.
Holland [J. Biol. Chem., 255, 2596-2605 (1980)]
this oligomer is complementary to the DNA sequence
encoding amino acids 306-310 (the wobble base of the
last amino acid was omitted from the oligomer) of
the GAPDH gene. Using hybridization conditions
described by R.B. Wallace et al., Nucleic Acid Res.
9, 879-894 (1981), six positive transformants could
be identified. One of these harboured plasmid
pFl 1-33. The latter plasmid contained the GAPDH gene
including its promoter/regulation region and its
transcription termination/polyadenylation region.
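
The coding-strand sequence recognised by the probe can be derived by taking the reverse complement of the oligomer and translating it. The following Python sketch does this with the standard genetic code; the function names are hypothetical, and which residues of the GAPDH protein the result corresponds to is taken from the cited publication, not verified here.

```python
# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))


def translate(dna: str) -> str:
    """Translate complete codons from position 0; trailing bases are ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    probe = "TACCAGGAGACCAACTT"           # oligomer from the text, 5' to 3'
    coding = reverse_complement(probe)    # strand the probe hybridizes to
    print(coding)                         # AAGTTGGTCTCCTGGTA
    print(translate(coding))              # KLVSW
    print(coding[len(coding) - len(coding) % 3:])  # leftover bases: TA
```

The two leftover bases (TA) are the first two positions of a further codon, which is consistent with the remark that the wobble base of the last amino acid was omitted from the oligomer.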

The approximately 9 kb long insert of pFl 1-33 has
been characterized by restriction enzyme analysis
(Fig. 1) and partial nucleotide sequence analysis
(Figs. 2 and 3).

Note: Unless stated otherwise, all enzyme
incubations were carried out under conditions
described by the supplier. Enzymes were
obtained from Amersham, Boehringer, BRL or
Biolabs.

2. Isolation of the GAPDH promoter/regulation region
   and its introduction into plasmids encoding
   thaumatin-precursors (Fig. 4).

On the basis of the restriction enzyme analysis and
the nucleotide sequence data of the insert of
plasmid pFl 1-33, the RNA initiation/regulation
region of the GAPDH gene was isolated as an 800
nucleotides long Dde I fragment. To identify this
promoter fragment, plasmid pFl 1-33 was digested
with Sal I and the three resulting DNA fragments
were subjected to a Southern hybridization test
with the chemically synthesized oligomer [E.M.
Southern, J. Mol. Biol. 98, 503-517 (1975)]. A
positively hybridizing 4.3 kb long restriction fragment
was isolated on a preparative scale by
electroelution from a 0.7% agarose gel and was then cleaved
with Dde I. Of the resulting Dde I fragments only
the largest one had a recognition site for Pvu II,
a cleavage site located within the GAPDH promoter
region (Fig. 1). The largest Dde I fragment was
isolated and incubated with Klenow DNA polymerase
and four dNTPs [A.R. Davis et al., Gene 10,
205-218 (1980)] to generate a blunt-ended DNA molecule.

After extraction of the reaction mixture with 
phenol/ chloroform (50/50 v/v), passage of the 
aqueous layer through a Sephadex G50 column and 
ethanol precipitation of the material present in 
the void volume, the DNA fragment was equipped with 
the 32 P-labelled Eco RI linker 5 1 GGAATTCC3 1 by 
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incubation with T4 DNA ligase. Owing to the Klenow 
DNA polymerase reaction and the subsequent ligation 
of the Eco RI linker, the original Dde I sites were 
reconstructed at the ends of the promoter fragment. 
To inactivate the ligase the reaction mixture was 
heated to 65 °C for 10 minutes, then sodium chloride 
was added (final concentration 50 mmol/l) and the 
whole mix was incubated with Eco RI. Incubation was 
terminated by extraction with phenol/chloroform, 
the DNA was precipitated twice with ethanol, re- 
suspended and then ligated into a suitable vector 
molecule. Since the Dde I promoter fragment was
equipped with Eco RI sites, it can easily be
introduced into the Eco RI site of pUR 528, pUR 523
and pUR 522 (EP-PA 54330 and EP-PA 54331) to create
plasmids in which the yeast promoter is adjacent to
the structural genes encoding thaumatin precursors.
The latter plasmids were obtained by cleavage of
pUR 528, pUR 523 and pUR 522 with Eco RI, treatment
of the linearized plasmid molecules with (calf
intestinal) alkaline phosphatase to prevent self-
ligation and incubation of each of these vector
molecules, together with the purified Dde I promoter
fragment, with T4 DNA ligase. Transformation of
CaCl2-treated E. coli HB101 cells with the various
ligation mixes yielded several ampicillin resistant
colonies. From some of these colonies plasmid DNA
was isolated [H.C. Birnboim and J. Doly, Nucleic
Acids Res. 7, 1513-1523 (1979)] and incubated with
Pvu II to test the orientation of the insert.
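
The reconstruction of the Dde I sites by the fill-in reaction and the Eco RI linker, as noted above, follows directly from the sequences involved and can be checked with a short sketch. The Python below assumes the usual Dde I specificity (5'-C^TNAG-3', leaving a three-nucleotide 5' overhang), which the text does not state. The sequence EXAMPLE and the function names are hypothetical illustrations and are not taken from the GAPDH sequence.

```python
import re

# Assumed specificity (not stated in the text): Dde I cuts 5'-C^TNAG-3'.
DDE_I_SITE = re.compile(r"CT[ACGT]AG")
ECO_RI_SITE = "GAATTC"
ECO_RI_LINKER = "GGAATTCC"  # linker sequence given in the text


def filled_in_dde_i_fragments(top_strand: str) -> list[str]:
    """Top strands of the internal Dde I fragments after Klenow fill-in."""
    starts = [m.start() for m in DDE_I_SITE.finditer(top_strand)]
    fragments = []
    for left, right in zip(starts, starts[1:]):
        # The cut lies after the C of each site, so the fragment starts at
        # the T of the left site; fill-in copies the TNA overhang at the
        # right site onto the top strand.
        fragments.append(top_strand[left + 1 : right + 4])
    return fragments


def add_linkers(fragment: str) -> str:
    """Blunt-end ligation of one Eco RI linker to each end."""
    return ECO_RI_LINKER + fragment + ECO_RI_LINKER


# Hypothetical sequence containing two Dde I sites (CTGAG and CTTAG).
EXAMPLE = "GATCCTGAGACGTACGTCTTAGGATC"

fragment = filled_in_dde_i_fragments(EXAMPLE)[0]
product = add_linkers(fragment)
print(product)
print("Dde I sites at:", [m.start() for m in DDE_I_SITE.finditer(product)])
print("Eco RI sites:", product.count(ECO_RI_SITE))
```

The last C of the left-hand linker and the first G of the right-hand linker complete the CTNAG sequences, so under these assumptions both junctions again carry a Dde I site next to the new Eco RI site.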

WO 84/04538
PCT/EP84/00153

In the nomenclature, plasmids containing the Eco RI
(Dde I) GAPDH promoter fragment in the correct
orientation (i.e. transcription from the GAPDH
promoter occurs in the direction of a downstream
located structural gene) are indicated by the
addendum -01 to the original code of the plasmid
(for example: pUR 528 is changed into pUR 528-01;
see Fig. 4).

To facilitate manipulation of plasmids containing
the Eco RI promoter fragment, one of the two Eco RI
sites was destroyed. Two µg of plasmid DNA (e.g.
pUR 528-01) was partially digested with Eco RI and
then incubated with 5 units Mung bean nuclease
(obtained from P.L. Biochemicals Inc.) in a total
volume of 200 µl in the presence of 0.05 moles/l
sodium acetate (pH 5.0), 0.05 moles/l sodium
chloride and 0.001 moles/l zinc chloride for 30
minutes at room temperature to remove sticky ends.
The nuclease was inactivated by addition of SDS to
a final concentration of 0.1% [D. Kowalski et al.,
Biochemistry 15, 4457-4463 (1976)] and the DNA was
precipitated by the addition of 2 volumes of
ethanol (in this case the addition of 0.1 volume of
3 moles/l sodium acetate was omitted). Linearized
DNA molecules were then religated by incubation
with T4 DNA ligase and used to transform CaCl2-
treated E. coli cells. Plasmid DNA isolated from
ampicillin resistant colonies was tested by
cleavage with Eco RI and Mlu I for the presence of
a single Eco RI site adjacent to the thaumatin gene
(cf. Fig. 4).
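
Why trimming the sticky ends destroys the site can be illustrated with a minimal sketch. It assumes the usual Eco RI specificity (5'-G^AATTC-3') and, for simplicity, treats the plasmid as a linear string. EXAMPLE and destroy_eco_ri_site are hypothetical names. Which of the two sites is opened in a partial digest is a matter of chance, which is why the desired product has to be selected by restriction analysis as described.

```python
ECO_RI_SITE = "GAATTC"


def destroy_eco_ri_site(seq: str, site_index: int) -> str:
    """Cut one Eco RI site, trim the single-stranded AATT overhangs
    (the role of Mung bean nuclease) and religate the blunt ends."""
    if seq[site_index : site_index + 6] != ECO_RI_SITE:
        raise ValueError("no Eco RI site at this position")
    left = seq[: site_index + 1]   # double-stranded part ends with G
    right = seq[site_index + 5 :]  # double-stranded part starts with C
    return left + right


# Hypothetical insert region: promoter-side site first, then GAATTCATG.
EXAMPLE = "TTGAATTCAAAGGGCCCGAATTCATGGC"

repaired = destroy_eco_ri_site(EXAMPLE, EXAMPLE.index(ECO_RI_SITE))
print(repaired)
print("Eco RI sites left:", repaired.count(ECO_RI_SITE))
```

In this sketch the six-nucleotide site collapses to GC, so the remaining Eco RI site next to the ATG codon becomes unique.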

Plasmids containing the GAPDH promoter fragment,
but having only a single Eco RI recognition site
adjacent to the ATG initiation codon of a downstream
located structural gene, are referred to as
-02 type plasmids (for example: pUR 528-01 is
changed into pUR 528-02; see Fig. 4).

3. Introduction of the GAPDH promoter/regulation
region into plasmids encoding chymosin-precursors.

To construct plasmids containing the GAPDH
promoter/regulation region adjacent to structural
genes encoding chymosin precursors, use was made of
plasmid pUR 528-02 (cf. 2) digested with Eco RI and
Hind III as a vector molecule into which various
structural genes were inserted. To overcome the
problem that all of the (pre)(pro)(pseudo)chymosin
encoding plasmids (pUR 1524, pUR 1523 and pUR 1521,
respectively, as described in EP-PA 77109) contain
an additional Eco RI site within the structural
gene, a Hind III digestion was carried out in
combination with a partial Eco RI digest. Restriction
fragments containing the intact and isolated gene
were extracted from the 1% agarose gel and added to
a T4 DNA ligation mix together with the vector. The
vector was prepared by digesting pUR 528-02 with
Hind III and Eco RI, treating the resulting
fragments with phosphatase and isolating the largest
fragment from a 0.7% agarose gel. After
transformation of CaCl2-treated E. coli cells with the
ligation mix and selection for ampicillin resistant
colonies, plasmids containing the GAPDH promoter/
regulation fragment adjacent to the (pre)(pro)-
(pseudo)chymosin encoding structural genes in the
appropriate orientation could be isolated. In a
similar way, plasmid pUR 1522 can be converted into
plasmid pUR 1522-02.

Plasmids containing the GAPDH promoter fragment and 
the genes coding for prepro-, pro-, pseudochymosin 
and chymosin are referred to as pUR 1524-02, pUR 
1523-02, pUR 1521-02 and pUR 1522-02, respectively. 

4. Reconstitution of the original GAPDH promoter/
regulation region in plasmids encoding
preprothaumatin by introduction of a synthetic DNA fragment
(Fig. 7A, Fig. 8).

As shown by the nucleotide sequence depicted in
Fig. 2, the Eco RI (Dde I) GAPDH promoter fragment
contains the nucleotides -844 to -39 of the original
GAPDH promoter/regulation region. Not contained
in this promoter fragment are the 38 nucleotides
preceding the ATG initiation codon of the GAPDH
encoding gene. The latter (38) nucleotide fragment
contains the PuCACACA sequence, which is found in
several yeast genes. Said PuCACACA sequence, located
about 20 bp upstream of the translation start site
[M.J. Dobson et al., Nucleic Acids Res. 10, 2625-
2637 (1982)], provides the nucleotide sequence which
is optimal for protein initiation [M. Kozak,
Nucleic Acids Res. 9, 5233-5252 (1981)]. Moreover,
as shown in Fig. 6, these 38 nucleotides allow the
formation of a small loop structure which might be
involved in the regulation of expression of the
GAPDH gene.
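
Locating a PuCACACA sequence relative to the ATG codon is a simple pattern search, sketched below. The leader sequence HYPOTHETICAL_LEADER is invented for illustration and is not the GAPDH sequence of Fig. 2. The position convention (the nucleotide immediately preceding the ATG is -1) is the one used throughout the text.

```python
import re

# Pu = purine (A or G); the lookahead also reports overlapping occurrences.
PU_CACACA = re.compile(r"(?=[AG]CACACA)")


def motif_positions(upstream: str) -> list[int]:
    """Start positions of PuCACACA relative to the ATG codon.

    `upstream` is the sequence immediately preceding the ATG, so its
    last nucleotide has position -1.
    """
    seq = upstream.upper()
    return [m.start() - len(seq) for m in PU_CACACA.finditer(seq)]


# Hypothetical 38-nucleotide leader, for illustration only.
HYPOTHETICAL_LEADER = "TTTTAGTTTTAAAATTTGACACACATAAACAAACAAAA"

print(len(HYPOTHETICAL_LEADER), motif_positions(HYPOTHETICAL_LEADER))
```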

On the basis of the above-mentioned arguments,
introduction of the 38 nucleotides between the Dde I
promoter fragment and the ATG codon of a downstream
located structural gene was considered necessary to
improve promoter activity as well as translation
initiation.

As outlined in Fig. 7A, the missing DNA fragment was
obtained by the chemical synthesis of two partially
overlapping oligomers. The Sac I site present in
the overlapping part of the two oligonucleotides
was introduced for two reasons: (i) to enable a
manipulation of the nucleotide sequence immediately
upstream of the ATG codon, including the
construction of poly A-tailed yeast expression vectors (see
11); (ii) to give a cleavage site for an enzyme
generating 3'-protruding ends that can easily and
reproducibly be removed by incubation with T4 DNA
polymerase in the presence of the four dNTP's.
Equimolar amounts of the two purified oligomers
were phosphorylated at their 5'-termini, hybridized
[J.J. Rossi et al., (1982), J. Biol. Chem. 257,
9226-9229] and converted into a double-stranded DNA
molecule by incubation with Klenow DNA polymerase
and the four dNTP's under conditions which have
been described for double-stranded DNA synthesis
[A.R. Davis et al., Gene 10, 205-218 (1980)].
Analysis of the reaction products by electrophoresis
through a 13% acrylamide gel followed by
autoradiography showed that more than 80% of the
starting single-stranded oligonucleotides were
converted into double-stranded material. The DNA
was isolated by passage of the reaction mix over a
Sephadex G50 column and ethanol precipitation of
the material present in the void volume. The DNA
was then phosphorylated by incubation with
polynucleotide kinase and digested with Dde I. To
remove the nucleotides cleaved off in the latter
reaction, the reaction mix was subjected to two
precipitations with ethanol.
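
The conversion of two partially overlapping oligomers into one double-stranded molecule can be sketched as follows. Both oligomers are written 5' to 3'; their 3' ends anneal and each serves as primer for the fill-in of the other. The two sequences are hypothetical (they merely carry a Sac I site, GAGCTC, in the overlap) and are not the oligomers of Fig. 7A. The function name and the min_overlap parameter are likewise assumptions made for the illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def mutually_primed_fill_in(top: str, bottom: str, min_overlap: int = 6) -> str:
    """Top strand of the duplex obtained after fill-in of two oligomers
    whose 3' ends are complementary (both given 5' to 3')."""
    bottom_as_top = reverse_complement(bottom)
    longest = min(len(top), len(bottom_as_top))
    for k in range(longest, min_overlap - 1, -1):
        # The 3' end of the top oligomer must equal the start of the
        # bottom oligomer read in top-strand orientation.
        if top[-k:] == bottom_as_top[:k]:
            return top + bottom_as_top[k:]
    raise ValueError("oligomers do not overlap sufficiently")


# Hypothetical oligomers with a 10-nucleotide overlap containing GAGCTC.
TOP_OLIGO = "TTAGACGTACCGAGCTCAA"
BOTTOM_OLIGO = "CATGAACCTTGAGCTCGG"

duplex_top = mutually_primed_fill_in(TOP_OLIGO, BOTTOM_OLIGO)
print(duplex_top, len(duplex_top))
```

The duplex is longer than either oligomer, which is the point of synthesizing only the two partially overlapping strands.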

As shown in Fig. 8, cloning of the resulting
synthetic DNA fragment was carried out by the
simultaneous ligation of this fragment and a Bgl II-Dde
I GAPDH promoter regulation fragment in a vector
molecule from which the Eco RI site preceding the
ATG initiation codon was removed by Mung bean
nuclease digestion (cf. 2). The Bgl II-Dde I
promoter/regulation fragment was obtained by
digestion of plasmid pUR 528-02 with Dde I and Bgl
II. Separation of the resulting restriction
fragments by electrophoresis through a 2% agarose gel
and subsequent isolation of the fragment from the
gel yielded the purified 793 nucleotides long
promoter/regulation fragment. In the plasmid pUR
528-02 the nucleotide sequence preceding the ATG
codon is 5'-GAATTCATG-3' (EP-PA 54330 and EP-PA
54331), which is different from the favourable
nucleotide sequence given by M. Kozak [Nucleic
Acids Res. 9, 5233-5252 (1981)]. Since our aim was
to reconstitute the original GAPDH promoter/
regulation/protein initiation region as accurately as
possible, the Eco RI site was removed in order to
ligate the synthetic DNA fragment to the resulting
blunt end. Removal of the Eco RI site was
accomplished by Mung bean nuclease digestion of Eco RI-
cleaved pUR 528-02 DNA (see 2).

Subsequently the plasmid DNA was digested with Bgl
II and incubated with phosphatase. After separation
of the two DNA fragments by electrophoresis through
a 0.7% agarose gel, the largest fragment was
isolated and used as the vector into which the Bgl
II-Dde I promoter fragment as well as the Dde I-
treated synthetic DNA fragment were ligated.
Plasmids into which the Dde I promoter/regulation
fragment together with the synthetic DNA fragment
containing the Sac I recognition site are
introduced are indicated by the addendum -03 (for
example: pUR 528-02 is changed into pUR 528-03).

Similar results can be obtained with plasmids
containing one of the maturation forms of
preprothaumatin as the structural gene, i.e.
prethaumatin, prothaumatin and thaumatin, which will
result in plasmids pUR 522-03, pUR 523-03 and
pUR 520-03, respectively.

In order to reconstitute the original GAPDH
promoter/regulation region as accurately as
possible, the Sac I site was removed from plasmid
pUR 528-03 (Fig. 9). This was accomplished by
digestion of the plasmid DNA with Sac I to generate
a linearized plasmid molecule with protruding 3'
ends. These ends were then made blunt-ended using
the 3'-exonuclease activity of T4 DNA polymerase in
the presence of the four dNTP's [T. Maniatis et al.
in Molecular Cloning; Cold Spring Harbor Laboratory
117-120 (1982)]. Circularisation of the linear
plasmid was accomplished by using T4 DNA ligase.

Plasmids from which the Sac I site present in the
synthetic DNA fragment is removed are referred to
as -04 type plasmids (for example: pUR 528-03 is
changed into pUR 528-04; see Fig. 9).

Similar results can be obtained with plasmids
containing a structural gene for one of the maturation
forms of preprothaumatin, i.e. prethaumatin,
prothaumatin and thaumatin, which will result in e.g.
plasmids pUR 522-04, pUR 523-04 and pUR 520-04,
respectively.

5. Reconstitution of the original GAPDH promoter/
regulation region in plasmids encoding (pre)(pro)-
(pseudo)chymosin by introduction of a synthetic DNA
fragment (Fig. 7B, Fig. 10).

To construct (pre)(pro)(pseudo)chymosin encoding
plasmids containing the -03 type GAPDH promoter/
regulation region (see 4), use can be made of
plasmid pUR 528-03 digested with Sac I and Hind III as
a vector molecule into which the various structural
genes are inserted together with a synthetic DNA
fragment (Fig. 10). The (pre)(pro)(pseudo)chymosin
encoding genes can be isolated from the plasmids
pUR 1524, pUR 1523, pUR 1521 and pUR 1522 (cf.
EP-PA 77109), respectively, by incubation with Sal I
in combination with a partial Eco RI digestion to
avoid cleavage of the additional Eco RI site
in the chymosin gene (cf. 3). The resulting DNA
fragments can then be incubated with Mung bean
nuclease (cf. 2) followed by a digestion with Hind
III. Restriction fragments containing the intact
and isolated gene can be purified by electrophoresis
through a 1% agarose gel and then
extracted from the gel. The synthetic DNA fragment
to be used in the constructions is depicted in Fig.
7B.



The vector molecule can be prepared by digestion of
plasmid pUR 528-03 with Sac I and Hind III followed
by a phosphatase treatment of the restriction
fragments and isolation of the largest fragment from
the 0.7% agarose gel. Vector molecule, synthetic
DNA fragment (Sac I-treated) and the DNA fragment
containing the (pre)(pro)(pseudo)chymosin encoding
nucleotide sequence can be incubated with T4 DNA
ligase and transformed into CaCl2-treated E. coli
cells. Plasmid DNA obtained from ampicillin-
resistant colonies can be tested by incubation with
various restriction enzymes.

The nomenclature of the newly created plasmids
(pUR 1524-03, pUR 1523-03, pUR 1521-03 and pUR
1522-03) is similar to the nomenclature of the
(pre)(pro)thaumatin-encoding plasmids. Removal of the
Sac I site has also been described (see 4). Removal
of the Sac I sites will result in the plasmids pUR
1524-04, pUR 1523-04, pUR 1521-04 and pUR 1522-04,
respectively (see Fig. 10).

6. DNA synthesis. 

Desired oligonucleotides were synthesized on a
polystyrene support using phosphotriester
methodology and a library of dimers [G.A. van der Marel
et al., Recl. Trav. Chim. Pays-Bas 101, 234-241
(1982)]. Chloromethylated polystyrene (1.34
mmoles/gram) was functionalized with 2-(4-
hydroxyphenyl)ethanol [Su-Sun Wang, J. Am. Chem.
Soc. 95, 1325-1333 (1973)] and then coupled with 5'
phosphorylated uridine protected as a mixture of 2'
and 3' acetate and levulinyl groups (analogously to
G.A. van der Marel et al., vide supra). This gave a
support with a levulinyl functionality of 130
µmoles/g. The synthesis cycle was carried out
in a 20 ml 2-necked pear-shaped flask, one neck of
which carried a sintered glass filter. This allowed
the functionalized polystyrene (60 mg) to remain in
the flask throughout the synthesis, which consisted
of the following steps (cf. Fig. 11).

1). Removal of the "levulinyl" group with hydrazine
    hydrate (0.5 mol/l) in propionic acid/pyridine
    (1:3 v/v) for 5 minutes.

2). Washing with 2 x 2 ml pyridine.

3). Addition of a fourfold molar excess of the
    required dinucleotide anion (60 µmoles)
    protected in the 5' position as levulinyl
    ester. This was coevaporated twice with
    pyridine and reduced to approximately 0.5 ml
    before adding MSNT (240 µmoles) [C.B. Reese
    et al., Tetrahedron Letters 2727-2730 (1978)].
    The mixture was shaken for 60 minutes (except
    the first cycle which was lengthened to 90
    minutes).

4). Washing with 2 x 2 ml pyridine.

5). Addition of acetic anhydride (0.2 ml) and
    dimethylaminopyridine in pyridine (1.5 ml,
    0.05 M) to react with any unreacted hydroxyl
    group during 5 minutes.

6). Washing with 2 x 2 ml pyridine.

This cycle was repeated with the appropriate
dimers until the desired sequence had been
prepared. The 2-chlorophenyl phosphate
protecting groups were removed with 0.3 mol/l
1,1,4,4-tetramethylguanidinium 2-
pyridinealdoximate (C.B. Reese et al., vide
supra) in dry acetonitrile for 24 hours. After
filtration and washing of the support the base
protecting groups (dA = benzoyl, dC = anisoyl,
dG = diphenylacetyl) were cleaved with
concentrated aqueous ammonia for 60 hours at
50 °C, which also cleaves the sequence from the
uridine attached to the support [R. Crea and T.
Horn, Nucleic Acids Res. 8, 2331-2348 (1980)].
Filtration of the aqueous phase from the
support and evaporation gave a crude mixture
which was given a clean-up by chromatography on
a short column (40 cm) of Sephadex G50 with
0.05 M triethylammonium bicarbonate as eluent.
Collecting and evaporating the first five
fractions containing UV absorbing material gave
a concentrate suitable for preparative gel
electrophoresis.
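
The bookkeeping of the synthesis cycle described above can be sketched in a few lines. Only the reaction times stated in steps 1, 3 and 5 are counted; washing, filtration and evaporation times are not given in the text and are left out. The stepwise coupling yield passed to overall_yield is a purely hypothetical figure, and rounding odd lengths up to the next dimer is an assumption of this sketch.

```python
import math

DELEVULINATION_MIN = 5    # step 1
COUPLING_MIN = 60         # step 3
FIRST_COUPLING_MIN = 90   # step 3, first cycle only
CAPPING_MIN = 5           # step 5


def cycles_needed(length_nt: int) -> int:
    """Dimer additions needed for an oligomer of the given length
    (assumption: odd lengths are rounded up)."""
    return math.ceil(length_nt / 2)


def reaction_minutes(cycles: int) -> int:
    """Sum of the stated reaction times for the given number of cycles."""
    if cycles < 1:
        return 0
    per_cycle = DELEVULINATION_MIN + COUPLING_MIN + CAPPING_MIN
    return cycles * per_cycle + (FIRST_COUPLING_MIN - COUPLING_MIN)


def overall_yield(cycles: int, stepwise_yield: float) -> float:
    """Fraction of full-length chains for a hypothetical stepwise yield."""
    return stepwise_yield ** cycles


n = cycles_needed(30)
print(n, reaction_minutes(n) / 60, round(overall_yield(n, 0.90), 3))
```

Building the chain from dimers halves the number of cycles compared with monomer addition, which matters because the overall yield falls off geometrically with the number of couplings.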

7. Structural features of the GAPDH promoter/
regulation region (Fig. 2, Fig. 6)

TATA or TATAAA sequences are believed to play an
important role in positioning RNA initiation sites
in eucaryotic promoter structures which are
recognized by RNA polymerase II [C. Breathnach and
P. Chambon, Annu. Rev. Biochem. 50, 349-384
(1981)]. Usually transcription is initiated 25 to
30 nucleotides 3' to such sequences, although for
yeast varying distances (up to 70 nucleotides; M.J.
Dobson et al., Nucleic Acids Res. 10, 2625-2637
(1982)) have been described. According to the
nucleotide sequence data obtained during the
present investigations for the GAPDH promoter/
regulation region, this structure contains two
additional sets of TATA and TATAAA sequences which
are located towards the ends of the promoter
fragment (Fig. 2). Most likely the clustered
5'TATATAA3' sequence occurring around position -130
is responsible for transcription of the GAPDH
encoding gene. The two other TATA and TATAAA
sequences present around positions -608 and -770
are possibly involved in regulation of GAPDH
expression (see below; P.K. Maitra and Z. Lobo, J.
Biol. Chem. 246, 489-499 (1971)). Besides these
RNA initiation signals, the GAPDH promoter/
regulation fragment contains various nucleotide sequences
which are implicated in transcription termination.
K.S. Zaret and F. Sherman [Cell 28, 563-573 (1982)]
found the sequences TAG-N-TAGT-N'-TTT or TAG-N''-
TATGT-N'''-TTT, in which N, N', N'' and N''' represent
the variable distances between the groups, in the
3' flanking sequences of a majority of yeast genes
examined. Identical or similar sequences occur in
the GAPDH promoter/regulation fragment around
position -625 (TAT-N4-TAGT-N5-TTT), around
position -324 (TAC-N13-TAGT-N17-TTA), around
position -192 (TAC-Ng-TATGT-N^-TTT) and around
position -180 (TAT-N17-TAGT-N23-TTT). In Fig. 6
these postulated termination signals are
represented by slashes. The largest open translation
reading frame present on the GAPDH promoter/
regulation fragment extends from position -450 to -337
and encodes a peptide 38 amino acids long. Since
the TATA box around position -608 precedes the open
reading frame, it might well be that the putative
peptide is translated from a transcript initiated
downstream of this TATA sequence. Particularly
interesting is the observation that the ATG
initiation codon of the peptide forms part of a
secondary structure which is generated upon base
pairing of the nucleotides extending from -448 to
-436 with the nucleotides extending from -419 to
-407. These features are reminiscent of the
situation described for the yeast leu 2 gene [A.
Andreadis et al., Cell 31, 319-325 (1982)] and,
together with the presence of the small stem/loop
structure preceding the ATG codon of the GAPDH
encoding gene, they might be involved in regulating
the expression of the latter gene.

8. Insertion of fragments of the GAPDH transcription 
termination/polyadenylation region in combination 
with the central transcription termination signal 
of phage M13 RF downstream of genes encoding 
pseudochymosin (Fig. 12). 

To isolate fragments of the GAPDH transcription
termination/polyadenylation region, plasmid pFl 1-33
was cleaved with restriction enzyme Afl II. Digestion
with this enzyme yielded two fragments, the
smaller of which has a length of 1307 nucleotides
and encompasses the GAPDH termination/
polyadenylation region from position 11 to 1317 (Fig. 12). The
latter fragment was isolated, incubated with Mung
bean nuclease to generate blunt ends (cf. 2) and
then equipped with the Bam HI linker
(5'CCGGATCCGG3') using T4 DNA ligase. Owing to the
presence of a naturally occurring Bam HI site in the
middle (around position 690) of the Afl II fragment
isolated, digestion of the ligation mix with Bam HI
resulted in two fragments (A and B; Fig. 12). The
larger of these two fragments (A) has a length of
677 nucleotides and contains the nucleotide
sequence region which is located immediately
downstream of the TAA translation termination codon.
Attempts to subclone the purified fragment A in
the Bam HI site of pBR322 proved to be unsuccessful,
since very few transformants were obtained and
these transformants always contained fragment A as
well as fragment B in various orientations (cf.
plasmid 294-17 depicted in Fig. 12).
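
The fragment lengths quoted above are consistent with inclusive position counting, as the short calculation below shows. The length derived for fragment B is only approximate, because the nucleotides removed by Mung bean nuclease and those added by the Bam HI linker are ignored; this simplification is an assumption of the sketch and not a figure from the text.

```python
def span_length(first: int, last: int) -> int:
    """Number of nucleotides from position `first` to `last`, inclusive."""
    return last - first + 1


AFL_II_FRAGMENT = span_length(11, 1317)  # positions given in the text
FRAGMENT_A = 677                         # length given in the text

# Rough estimate only: end trimming and linker nucleotides are ignored.
approx_fragment_b = AFL_II_FRAGMENT - FRAGMENT_A

print(AFL_II_FRAGMENT, approx_fragment_b)
```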

To overcome this instability of fragment A, use
was made of the central transcription termination
signal of bacteriophage M13 RF [L. Edens et al.,
Nucleic Acids Res. 2, 1811-1820 (1975)]. M13 RF was
digested with Taq I and the resulting fragments
were made blunt-ended by incubation with Mung bean
nuclease. The fragments were then equipped with a
Hind III linker (5'AGAAGCTTCT3') using T4 DNA
ligase followed by a digestion with Hind III. The
DNA was precipitated from the reaction mix by the
addition of two volumes of ethanol. The precipitate
was resuspended and the various restriction
fragments were separated by electrophoresis through
a 4% acrylamide gel. From this gel the 441
nucleotides long fragment containing the central
transcription termination signal was isolated.

After cleavage of the purified fragment with Sau
3A, the fragment containing the nucleotides 1509 to
1717 [P.M.G. van Wezenbeek et al., Gene 11, 129-148
(1980)] was cloned in pBR322 which had been
digested with Hind III and Bam HI (Fig. 12).

To create plasmids in which the 3'-untranslated
region of the genes encoding (pre)(pro)(pseudo)-
chymosin was replaced by the 3'-untranslated region
of the GAPDH encoding gene, plasmid pUR 1521-02 was
isolated from an E. coli strain deficient in
adenine methylase and cleaved with restriction
endonuclease Bcl I. This enzyme recognizes the
(unmethylated) sequence 5'TGATCA3' and cleaves the
(pre)(pro)(pseudo)chymosin encoding genes within
their translation termination codon TGA. To
generate suitable vector molecules in which both
fragment A and the M13 transcription termination
signal could be cloned, the Bcl I-cleaved plasmid
molecules were incubated first with Sal I, followed
by alkaline phosphatase. The larger of the two
restriction fragments generated was isolated from
the 0.7% agarose gel. For the isolation of fragment
A, plasmid 294-17 was digested with Hind III and
cleaved partially with Bam HI. The 1020 nucleotides
long Hind III-Bam HI fragment containing the Hpa I
site was obtained by electro-elution from the 1%
agarose gel. The latter fragment, the vector and
the 485 nucleotides long Hind III-Sal I fragment
obtained from pBR322 containing the M13
transcription termination signal were incubated with T4
DNA ligase and transformed into CaCl2-treated E.
coli cells. This finally yielded the plasmid pUR
1521-12. Similar results can be obtained with
plasmids pUR 1524-02, pUR 1523-02 and pUR 1522-02,
which will result in plasmids pUR 1524-12, pUR
1523-12 and pUR 1522-12, respectively.

In the nomenclature, plasmids containing fragment A
and the M13 transcription termination signal
downstream of a newly inserted structural gene are
indicated by replacing the addendum -0x with -1x.
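
The renaming rule can be written down as a small helper, shown here only to make the convention explicit. The function name and the accepted spelling of the plasmid codes (pUR or pURY, an optional space, a number and a two-digit addendum) are assumptions of this sketch.

```python
import re

# Plasmid code such as "pUR 1521-02": prefix, number, addendum -0x.
PLASMID_CODE = re.compile(r"^(pURY?) ?(\d+)-0(\d)$")


def add_terminator_code(name: str) -> str:
    """Replace the addendum -0x by -1x, e.g. pUR 1521-02 -> pUR 1521-12."""
    match = PLASMID_CODE.match(name.strip())
    if match is None:
        raise ValueError(f"not a -0x type plasmid code: {name!r}")
    prefix, number, variant = match.groups()
    return f"{prefix} {number}-1{variant}"


for code in ("pUR 1521-02", "pUR 1524-02", "pUR 1523-02", "pUR 1522-02"):
    print(code, "->", add_terminator_code(code))
```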

9. Introduction of the 2 micron DNA replication origin
and the yeast leu 2 gene into plasmids encoding
thaumatin-precursors and chymosin-precursors (Fig.
13, Fig. 14).

The E. coli-yeast shuttle vector pMP81 [Fig. 13;
C.P. Hollenberg, Current Topics in Microbiol. and
Immunol. 96, 119-144 (1982)] consists of plasmid
pCRI [C. Covey et al., MGG 145, 155-158 (1976)]
and a double Eco RI fragment of pJDB 219 [J.D.
Beggs, Nature 275, 104-109 (1978)] carrying both
the leu 2 gene and the yeast 2 micron DNA
replication origin. The latter two functions can
be excised from pMP81 by a digestion with Hind III
and Sal I. The resulting 4.4 kb long restriction
fragment was introduced into the various pBR322
derivatives containing genes encoding thaumatin-
precursors in combination with the various forms of
the GAPDH promoter region of S. cerevisiae. A
similar procedure was used to introduce the 4.4 kb
long restriction fragment into the various pBR322
derivatives containing genes encoding chymosin-
precursors in combination with the GAPDH promoter
region of S. cerevisiae.

The introduction of the Hind III-Sal I fragment
containing the 2 micron replication origin and leu 2 gene
into the various plasmids was accomplished by
cleavage of the E. coli plasmids with Hind III and
Sal I (cf. Fig. 14) and a subsequent treatment of
the resulting fragments with phosphatase. After
separation of the fragments by electrophoresis
through a 1% agarose gel, the largest fragment was
isolated, mixed with the purified Hind III-Sal I
fragment obtained from pMP81, ligated with T4 DNA
ligase and transformed into CaCl2-treated E. coli
cells.

In the pseudochymosin encoding plasmid containing
termination fragment A (pUR 1521-12; cf. Fig. 12),
the 4.4 kb long Hind III-Sal I fragment was
inserted as well. For this purpose plasmid pUR
1521-12 was digested with both Hind III and Sal I
(to remove the M13 transcription termination
region), after which the 2 micron origin and leu 2
gene containing fragment from pMP81 could be
inserted. Unexpectedly, this plasmid construction
(pURY 1521-12) proved to be stable in E. coli.



From some of the ampicillin-resistant transformants
plasmid DNA was isolated and subjected to
restriction enzyme analysis. Plasmids containing the
correct insert were purified by CsCl-ethidium bromide
density gradient centrifugation and used to
transform S. cerevisiae AH22 (A. Hinnen et al.,
Proc. Natl. Acad. Sci. USA 75, 1929-1933 (1978))
according to the procedure of J.D. Beggs, Nature
275, 104-109 (1978). This resulted in plasmids
indicated by the letter code pURY but having the
same figure codes (cf. Fig. 14, in which the
conversion of plasmid pUR 528-03 into pURY 528-03 is
indicated). Similarly, plasmids pUR 522-02, pUR
523-02, pUR 1524-02, pUR 1523-02 and pUR 1521-02
were converted into their corresponding pURY
plasmids.

With the availability of yeast transformants
containing the newly constructed plasmids, the effects
of plasmid variation on gene expression could be
monitored. For this purpose an enzyme-linked
immunosorbent assay [ELISA; A. Voller et al., Bull.
World Health Organ. 53, 55-65 (1976)] for
thaumatin was developed and used to quantitate the
amounts of thaumatin-like protein present in yeast
extracts. The results obtained in these experiments
show that upon the introduction of the -02 or -03
type GAPDH promoter in pURY 528 (cf. 2, 4),
thaumatin synthesis is increased by more than two
orders of magnitude (more than 100 times). Upon the
introduction of the 280 nucleotides long promoter
fragment described by J.P. Holland and M.J. Holland
[J. Biol. Chem. 255, 2596-2605 (1980)], however,
thaumatin synthesis increased by only one order of
magnitude (about 10 times), thereby demonstrating
the important role played by upstream sequences of
the GAPDH promoter in its functioning, which is
contrary to the conclusions of Holland and Holland
[cf. J. Biol. Chem. 254, 9839-45 (1979)].

In another experiment the expression of prepro- 
thaumatin and prothaumatin encoding genes was com- 
pared. As expression of the prothaumatin encoding 
gene turned out to be almost negligible, this 
result clearly demonstrated the enormous impact 
which signal sequences can have on gene expression. 
The importance of the processing step was further 
substantiated by the results shown in Fig. 20. The 
latter experiment demonstrates that yeast cells 
harbouring plasmids encoding preprothaumatin are 
able to produce a thaumatin-like protein with a 
molecular weight which is practically the same as 
the molecular weight of thaumatin II molecules 
isolated from the plant. 

This suggests that yeast is able to correctly pro- 
cess the plant signal sequence. Recently obtained 
data on the amino-terminal amino acid sequence of 
yeast-synthesized thaumatin have provided definite 
evidence for this notion. 

In conclusion, the results obtained strongly
suggest that the signal sequence plays an important
role in either stimulating protein synthesis or
increasing protein stability in yeasts. It is not yet
certain whether prothaumatin or thaumatin is produced
by yeasts, in view of the small difference in
molecular weight (about 3%) between these proteins,
whereas the difference with preprothaumatin (about
10%) is easily detectable, as was demonstrated in
Fig. 20.

10. The introduction of GAPDH transcription
termination/polyadenylation regions into pURY plasmids.

One of the possibilities to introduce GAPDH
transcription termination/polyadenylation regions into
pURY plasmids is to use the unique Hind III
restriction site of these plasmids for insertion of
various parts of the Afl II fragment depicted in
Fig. 12. For example, the insertion of
transcription termination fragment A (cf. Fig. 12) can
be carried out as follows. Cleavage of plasmid pURY
528-03 with Hind III and a subsequent Mung bean
nuclease digestion (cf. 2) will yield a linearized
and blunt-ended plasmid molecule (cf. Fig. 14).
Incubation of this DNA with T4 DNA ligase and a
suitable Bam HI linker will equip the fragment with Bam
HI sites (cf. 8). Upon transformation of the Bam
HI-cleaved and religated product into E. coli, pURY
528-03 plasmids in which the Hind III site is
replaced by a Bam HI site can be recovered. Into this
newly created Bam HI site transcription termination
fragment A can be inserted. Digestion of the latter
construction with Hpa I will indicate whether or
not fragment A is inserted in the correct
orientation, since in the 2 micron DNA sequence an
additional Hpa I site is available.

Using the same approach but different linker
molecules, different terminator fragments can be
introduced at various sites in the plasmid molecule.

11. Construction of an E. coli-yeast shuttle vector
widely applicable for the expression of foreign
genes in yeast (Fig. 15).

Derivatives of the E. coli-yeast shuttle vectors
described under 5 and 9, useful as generally
applicable yeast expression vectors, can be prepared.
In these derivatives the GAPDH promoter/regulation
region including a chemically synthesized DNA
fragment can be used for RNA initiation, whereas
transcription termination can be effected by the
transcription termination/polyadenylation region of the
"Able" operon of the 2 micron DNA. Insertion of
foreign genes in these expression vectors can be
accomplished by homopolymer (poly dA) tailing of
the Sac I site in the promoter and the Hind III
site present in the "Able" operon in combination
with poly dT tailing of the nucleotide sequence to
be inserted.

For the preparation of the vector molecule, plasmid 
pURY 528-03 can be cleaved with Hind III and then 
incubated with Klenow DNA polymerase and the four 
dNTP's to generate a blunt-ended, linearized DNA 
molecule (Fig. 15). Subsequently this DNA molecule 
can be cleaved with Sac I and of the two resulting 
DNA fragments the larger can be isolated from a 
0.7% agarose gel and incubated with terminal 
transferase and dATP under conditions described by 
G. Deng and R. Wu [Nucleic Acids Res. 9, 4173-4188 
(1981)]. The time of incubation has to be such that
the tail added to the 3' end generated by cleavage
with Sac I has a length of about 20 dA residues.

The introduction of a foreign gene into the poly 
dA-tailed expression vector can be carried out by 
incubating the DNA containing this gene with a set 
of restriction enzymes such that the desired gene 
can be cleaved as close as possible upstream of the 
translation initiation codon and downstream of the 
translation termination codon of this gene. Since
promoter regions can be preceded by transcription
termination signals, it is important that the original
promoter is not contained within the resulting DNA
fragment. After purification, the trimmed DNA fragment
can be equipped with poly dT tails.
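
The choice of cleavage sites described above can be sketched as a simple search. The following is an illustration only, not the procedure of the specification: the demonstration sequence is invented, the enzyme list is an arbitrary small selection, and a site is represented by the start of its recognition sequence rather than by the exact cut position. Enzymes that also cut within the coding region are skipped, since they would destroy the gene.

```python
# Recognition sequences of a few common restriction enzymes.
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC",
         "HindIII": "AAGCTT", "SacI": "GAGCTC"}


def find_all(seq, motif):
    """Return every start index at which motif occurs in seq."""
    hits, i = [], seq.find(motif)
    while i != -1:
        hits.append(i)
        i = seq.find(motif, i + 1)
    return hits


def closest_flanking_sites(seq, cds_start, cds_end, sites=SITES):
    """Find the nearest usable sites upstream and downstream of a coding region.

    cds_start : index of the first base of the initiation codon
    cds_end   : index just past the last base of the termination codon
    Returns (upstream, downstream), each an (enzyme, index) tuple or None.
    """
    upstream = downstream = None
    for name, motif in sites.items():
        hits = find_all(seq, motif)
        # Skip enzymes whose recognition site overlaps the coding region.
        if any(h < cds_end and h + len(motif) > cds_start for h in hits):
            continue
        for h in hits:
            if h + len(motif) <= cds_start:
                if upstream is None or h > upstream[1]:
                    upstream = (name, h)
            elif downstream is None or h < downstream[1]:
                downstream = (name, h)
    return upstream, downstream


if __name__ == "__main__":
    # Invented example: 5' flank, a short open reading frame, 3' flank.
    flank5 = "TTGAATTCAAGGATCCAC"
    cds = "ATGGCTAAGTTGTAA"
    flank3 = "CCAAGCTTGGGAGCTCTT"
    seq = flank5 + cds + flank3
    print(closest_flanking_sites(seq, len(flank5), len(flank5) + len(cds)))
```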

If the gene to be inserted is obtained by reverse
transcription of an mRNA molecule, the poly dT
tails can be directly added to the S1 nuclease-treated
double-stranded DNA molecule. In all cases the time of
incubation with terminal transferase must be chosen
such that poly dT tails with a length of about 20
nucleotides are generated. The DNA to be inserted and
the poly dA-tailed vector molecule can then be
hybridized by incubation at 65°C for 10 minutes,
followed by cooling down the hybridization mixture
slowly to room temperature. The mixture can
subsequently be transformed into CaCl2-treated E. coli
cells. Plasmid DNA isolated from ampicillin-resistant
transformants can then be used to transform yeast cells.

12. Expression in K. lactis of the preprothaumatin
encoding gene under control of the promoter/regulation
region of the glyceraldehyde-3-phosphate
dehydrogenase encoding gene of S. cerevisiae
(Fig. 16, Fig. 17).

The E. coli-yeast shuttle vector pEK 2-7 consists
of plasmid YRp7 [D.T. Stinchcomb et al., Nature
282, 39-43 (1979)] containing the 1.2 kb KARS-2
fragment. Owing to the presence of the yeast trp 1
gene, plasmid pEK 2-7 can be maintained in K.
lactis SD11 (lac 4, trp 1; cf. pages 18 and 19 of
European Patent Application N° 0 096 910; A1).

To demonstrate the functionality of the promoter/
regulation region of the GAPDH encoding gene in
K. lactis, plasmid pUR 528-03 (Fig. 8) has been
equipped with both KARS-2 and the trp 1 gene. The
latter two functions were excised from pEK 2-7 by a
digestion with Bgl II followed by the isolation
from a 0.7% agarose gel of the smallest fragment
generated. This purified fragment was then inserted
in the dephosphorylated Bgl II cleavage site of pUR
528-03 by incubation with T4 DNA ligase. Transformation
of the ligation mix into CaCl2-treated E. coli cells
yielded plasmid pURK 528-03 (Fig. 16).
Transformants generated by the introduction of the
latter plasmid into K. lactis SD11 cells by the
procedure described in European Patent Application
N° 0 096 910; A1 could be shown to synthesize
preprothaumatin (Fig. 17).

By techniques similar to those mentioned above,
plasmid pURY 528-03 was also equipped with KARS-2
and the yeast trp 1 gene and introduced into K.
lactis SD11 (Fig. 16). Using the same detection
procedure, K. lactis cells carrying pURK 528-33
could also be shown to synthesize preprothaumatin
(Fig. 17). Preprothaumatin production in cells
containing pURK 528-33 was, however, slightly
higher than in cells containing pURK 528-03. Since
similar observations have been made by C. Gerbaud
et al. [Gene 5, 233-253 (1979)] in the expression
of the yeast ura 3 gene upon insertion of this gene
within the coding region "Able" of the 2 micron DNA,
it is very likely that the enhanced expression of
preprothaumatin by pURK 528-33 is due to efficient
transcription termination events in the transcription
termination/polyadenylation region of the "Able"
operon. This observation indicates that the presence
of an efficient transcription termination/
polyadenylation region downstream of a structural
gene transcribed by the GAPDH promoter/regulation
region is an important factor in optimizing gene
expression.

13. Integration of structural genes under the control
of the GAPDH promoter/regulation region into the
yeast chromosome (Fig. 18, Fig. 19).

Integration of DNA sequences encoding either a
heterologous or homologous structural gene under
transcriptional control of the GAPDH promoter/
regulation region and, optionally, the GAPDH
termination/polyadenylation region into the yeast
genome can be achieved on the basis of techniques
described by R.J. Rothstein [in Methods in Enzymology
101, 202-211 (1983)] and K. Struhl [Nature
305, 391-397 (1983)]. The prerequisites for applying
these techniques are the availability of (i) suitable
marker genes, (ii) cloned DNA sequences homologous
with DNA sequences present on the yeast genome and
(iii) an intact, homologous or heterologous,
protein-encoding DNA sequence.

Having the marker genes (e.g. leu 2, trp 1, his 3)
and the protein-encoding DNA sequence (e.g. the
sequences encoding thaumatin precursors or
chymosin precursors) available, the latter
sequences can be integrated into the yeast genome
by homologous recombination events either between
GAPDH promoter and terminator sequences (cf. Fig.
19) or between the marker genes. In the latter
approach, which offers a variety of integration
sites, the foreign protein-encoding DNA sequence is
integrated within the "wild type" marker gene,
thereby destroying the function of this gene.
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14. Chemical synthesis of structural gene. 

Construction of synthetically chimeric genes.

Expression experiments in S. cerevisiae with the various
structural genes of preprothaumatin and preprochymosin,
located on the 2 micron DNA vector and under control
of the GAPDH promoter/regulation region and the
GAPDH termination region, resulted in an expression
yield that was considered not yet economically
attractive for the fermentative production of these
proteins. The expression of preprothaumatin and
preprochymosin genes located on the KARS vector did
not reach economic feasibility either.

Another problem observed during expression
experiments was that, although preprothaumatin was
processed correctly in both S. cerevisiae and K. lactis
into thaumatin (Fig. 20) (it is not clear whether the 6
amino acids on the C-terminus are also removed
during processing), such correct processing could not
be detected with preprochymosin.

Therefore it is advocated to make the preprochymosin
and preprothaumatin genes in the preferred codons of
S. cerevisiae [J.P. Holland and M.J. Holland, J.
Biol. Chem. 255, 2596-2605 (1980)] to increase the
expression yield. The syntheses of these genes can
be carried out according to the methods described
under 4 and 6. The nucleotide sequences of chymosin
and thaumatin in a preferred codon usage are given
in Fig. 21 and Fig. 22.
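
The conversion of an amino acid sequence into preferred codons can be sketched as a table lookup. The sketch below uses the codons listed in claim 8; where the claim allows two codons for an amino acid, the choice made here (DEFAULT_PICK) is merely an illustrative assumption and is not prescribed by the text. With this choice the acid phosphatase and invertase signal peptides of claim 6 are converted into the leader sequences shown in Fig. 23.

```python
# Preferred yeast codons as listed in claim 8 (three-letter amino acid codes).
PREFERRED = {
    "Ala": ("GCC", "GCT"), "Leu": ("TTG",), "Arg": ("AGA",), "Lys": ("AAG",),
    "Asn": ("AAC",), "Met": ("ATG",), "Asp": ("GAC", "GAT"), "Phe": ("TTC",),
    "Cys": ("TGT",), "Pro": ("CCA",), "Gln": ("CAA",), "Ser": ("TCC", "TCT"),
    "Glu": ("GAA",), "Thr": ("ACC", "ACT"), "Gly": ("GGT",), "Trp": ("TGG",),
    "His": ("CAC",), "Tyr": ("TAC",), "Ile": ("ATC", "ATT"),
    "Val": ("GTC", "GTT"),
}

# Illustrative choice where two codons are allowed (an assumption).
DEFAULT_PICK = {"Ala": "GCT", "Asp": "GAC", "Ser": "TCC",
                "Thr": "ACT", "Ile": "ATT", "Val": "GTT"}


def back_translate(peptide, pick=DEFAULT_PICK):
    """Convert a dot-separated peptide (e.g. 'Met.Phe.Lys') into DNA."""
    residues = [r.strip() for r in peptide.split(".") if r.strip()]
    codons = []
    for residue in residues:
        options = PREFERRED[residue]          # KeyError for unknown residues
        codon = pick.get(residue, options[0])
        if codon not in options:
            raise ValueError(f"{codon} is not a listed codon for {residue}")
        codons.append(codon)
    return "".join(codons)


if __name__ == "__main__":
    acid_phosphatase = ("Met.Phe.Lys.Ser.Val.Val.Tyr.Ser.Ile.Leu."
                        "Ala.Ala.Ser.Leu.Ala.Asn.Ala")
    print(back_translate(acid_phosphatase))
```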


As described under 4 and 6, the synthesis can be
carried out by making small (up to 30 nucleotides)
single-stranded DNA fragments with partially
overlapping sequences. This method also makes it
possible to obtain the various maturation forms of both
genes simultaneously (cf. Fig. 21-22).
Moreover, the approach adopted is suitable for
making changes in parts of the sequences. One can
use this possibility to replace the leader sequence
of preprochymosin (nucleotides -174 to -127,
Fig. 21) by various other leader sequences. Two
typical yeast leader sequences are that of acid
phosphatase [K. Arima et al., Nucleic Acids Res.
11, 1657-1672 (1983)] and that of invertase [R.
Taussig and M. Carlson, Nucleic Acids Res. 11,
1943-1954 (1983)]. Examples of nucleotide sequences
encoding both leader sequences with codons preferred
by yeasts are given in Fig. 23, whereas Fig.
24 gives schematically two designed chimeric acid
phosphatase/prochymosin and invertase/prochymosin
genes. Because preprothaumatin was processed
correctly by the yeast cells, we also designed a
chimeric gene of prochymosin and the leader
sequence of the preprothaumatin gene (Fig. 24).
Based on physico-chemical considerations about
the nature of the yeast leader sequences and the
interaction of these leader sequences with the
signal recognition protein [P. Walter and G.
Blobel, Proc. Natl. Acad. Sci. USA 77, 7112-7116
(1980)], we also designed two consensus leader
sequences (Fig. 25) which can be used to make
chimeric genes of these consensus sequences with
the prochymosin gene (Fig. 24).
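
The exchange of leader sequences can be illustrated by a short sketch that joins a leader to the start of the prochymosin coding sequence and checks that the reading frame is maintained. It is an illustration only: the function names are hypothetical, the leader is the acid phosphatase sequence of Fig. 23, and only the first 30 nucleotides of the pro part of Fig. 21 are used. Translation uses the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def clean(seq):
    """Remove the blanks used to print sequences in blocks of ten."""
    return seq.replace(" ", "").upper()


def make_chimeric(leader, body):
    """Join a leader sequence to a coding sequence, keeping the reading frame."""
    leader, body = clean(leader), clean(body)
    if not leader.startswith("ATG"):
        raise ValueError("leader must start with the initiation codon ATG")
    if len(leader) % 3 or len(body) % 3:
        raise ValueError("leader and body must each be a whole number of codons")
    return leader + body


def translate(dna):
    """Translate a DNA sequence codon by codon (one-letter amino acid codes)."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


if __name__ == "__main__":
    acid_phosphatase_leader = ("ATGTTCAAGT CCGTTGTTTA CTCCATTTTG "
                               "GCTGCTTCCT TGGCTAACGC T")
    prochymosin_start = "GCTGAAATTA CTAGAATTCC ATTGTACAAG"
    gene = make_chimeric(acid_phosphatase_leader, prochymosin_start)
    print(gene)
    print(translate(gene))
```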
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Legends to the figures 

Fig. 1 Restriction endonuclease cleavage map of a region of plasmid
pFl 1-33 containing a yeast glyceraldehyde-3-phosphate 
dehydrogenase operon. 

Fig. 2 Nucleotide sequence of the promoter/regulation region of a
glyceraldehyde-3-phosphate dehydrogenase operon cloned in
pFl 1-33. TATA and TATAAA sequences are indicated by solid
underlining. Presumptive transcription termination signals are
underlined with dots. Nucleotide sequences between brackets
indicate inverted repeats. The nucleotide sequence encoding the
38 amino acids long peptide is enclosed in a box.

Fig. 3 Nucleotide sequence of the transcription termination/ 
polyadenylation region of a glyceraldehyde-3-phosphate 
dehydrogenase operon cloned in pFl 1-33. AATAA sequences are 
indicated by solid underlining. Presumptive transcription 
termination signals are underlined with dots. 

Fig. 4 Schematic representation of the insertion of the Eco RI (Dde I) 
GAPDH promoter/regulation fragment in the preprothaumatin 
encoding plasmid and removal of an Eco RI cleavage site from 
the resulting plasmid. 

Fig. 5 Schematic representation of the construction of
(pre)(pro)(pseudo)chymosin encoding plasmids containing the
Eco RI (Dde I) GAPDH promoter/regulation fragment.

Fig. 6 Schematic representation of the structure of the GAPDH
promoter/regulation region including potential stem and loop
structures. Presumptive transcription termination signals are
indicated by slashes. The position of the coding sequence for
the 38 amino acids long peptide on the fragment is shown by ATG
and TAA codons.

Fig. 7a Representation of the various steps involved in the preparation
of the synthetic DNA fragment used to reconstitute the
original GAPDH promoter/regulation region upstream of
preprothaumatin encoding nucleotide sequences.

Fig. 7b Representation of the synthetic DNA fragment used to
reconstitute the original GAPDH promoter/regulation region
upstream of (pre)(pro)(pseudo)chymosin encoding nucleotide
sequences.

Fig. 8 Schematic representation of the introduction of the synthetic 
DNA fragment in preprothaumatin encoding plasmids. 



Fig. 9 Schematic representation of the removal of the Sac I site from 
the reconstituted GAPDH promoter/regulation region. 

Fig. 10 Schematic representation of the introduction of the synthetic 
DNA fragment in (pre)(pro)(pseudo)chymosin encoding plasmids. 

Fig. 11 General scheme for synthesis of DNA fragments on a polystyrene
support.

Fig. 12 Schematic representation of an Afl II-Bam HI transcription
termination/polyadenylation fragment obtained from pFl 1-33
and its insertion, in combination with the M13 central
transcription termination signal, downstream of the nucleotide
sequence encoding pseudochymosin.

Fig. 13 Schematic representation of plasmid pMP81. 

Fig. 14 Schematic representation of the introduction of the 2 micron
DNA origin of replication and the leu 2 gene obtained from
pMP81 by digestion with Hind III and Sal I into preprothaumatin
encoding plasmids.

Fig. 15 Schematic representation of an E. coli-yeast shuttle vector 
widely applicable for the expression of foreign genes in 
yeast. 

Fig. 16 Schematic representation of the introduction of the KARS-2 and
trp 1 gene obtained from pEK 2-7 by digestion with Bgl II into
preprothaumatin encoding plasmids.

Fig. 17 Fluorogram of (35S)-labelled thaumatin-like proteins
synthesized by K. lactis SD11 cells containing plasmid pEK 2-7
(lane a), plasmid pURK 528-03 (lane b) and plasmid pURK 528-33
(lane c). Yeast transformants were grown for 3 hours on a
minimal medium containing 35S-cysteine. Cells were collected
by centrifugation, resuspended in 1 ml 2.0 mol/l sorbitol,
0.025 mol/l NaPO4 pH 7.5, 1 mmol/l EDTA, 1 mmol/l MgCl2,
2.5% β-mercaptoethanol, 1 mg/ml zymolyase 60.000 and incubated
for 30 minutes at 30°C. Spheroplasts were then centrifuged
and lysed by the addition of 270 µl H2O, 4 µl 100 mmol/l
PMSF, 8 µl 250 mmol/l EDTA, 40 µl 9% NaCl and 80 µl 5x PBSTDS
(50 mmol/l NaPO4 pH 7.2, 5% Triton X 100, 2.5% deoxycholate,
2.5% SDS). Immunoprecipitation of thaumatin-like proteins and
analysis of precipitated proteins was carried out as described
by L. Edens et al., Gene 18, 1-12 (1982).
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Fig. 18 Schematic representation of plasmid pEK 2-7. 

Fig. 19 Schematic representation of a plasmid which can be used for the
insertion of a pseudochymosin-encoding nucleotide sequence
under transcriptional control of the GAPDH promoter/regulation
and the GAPDH termination/polyadenylation region into the yeast
chromosome.

Fig. 20 Fluorogram of (35S)-cysteine labelled thaumatin-like protein
synthesized by S. cerevisiae AH 22 cells containing plasmid pURY
528-03 (lane c) or synthesized in vitro in a wheat germ protein
synthesizing system under the direction of mRNA purified from
arils of Thaumatococcus fruits (lane b). Lane a shows
radioactive marker proteins (obtained from Amersham) and lane d
shows the position of thaumatin II isolated from the arils of
Thaumatococcus fruits. Lysis of yeast cells was carried out
as described in the legend of Fig. 17. Wheat germ translation
of mRNA and immunoprecipitation procedures were carried out
as described by L. Edens et al., Gene 18, 1-12 (1982).

Fig. 21 Nucleotide sequence of the gene encoding
(pre)(pro)(pseudo)chymosin in codon usage preferred by
S. cerevisiae.

Fig. 22 Nucleotide sequence of the gene encoding (pre)(pro)thaumatin
in codon usage preferred by S. cerevisiae.

Fig. 23 Nucleotide sequences of the genes encoding the leader sequences
of acid phosphatase and invertase in codon usage preferred
by S. cerevisiae.

Fig. 24 Schematic representation of the construction of the gene 

encoding prochymosin provided with the leader sequences of 
the invertase, acid phosphatase or thaumatin encoding genes, 
or with the two designed consensus leader sequences. 

Fig. 25 Nucleotide sequences of the two designed consensus leader
sequences.
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CLAIMS 

1. A DNA sequence capable of initiating transcription
by yeast RNA polymerase II which includes at
least part of the regulon region of one of the GAPDH
genes of S. cerevisiae, characterized in that it
comprises a DNA sequence essentially as given in Fig. 2,
and wherein the regulon region is optionally modified
to include at least one restriction enzyme cleavage
site, to facilitate manipulation of the nucleotide
sequence region for protein synthesis initiation.

2. A DNA sequence capable of both termination of
the transcription by yeast RNA polymerase II and
effecting polyadenylation of the mRNA, which includes at
least part of the termination/polyadenylation region
belonging to one of the GAPDH genes of S. cerevisiae,
characterized in that it comprises a DNA sequence
essentially as given in Fig. 3.

3. DNA sequence, which can be inserted into a
recombinant DNA plasmid or into a yeast chromosome
comprising:

(a) a DNA sequence according to claim 1, and

(b) one or more structural genes different from the
GAPDH genes of S. cerevisiae, and at least two of
features (c)-(f),

(c) one or more specific DNA sequences capable of
terminating the transcription by yeast RNA polymerase
II and effecting polyadenylation of the mRNA,
and/or

(d) one or more selection markers, and/or

(e) either one or more nucleotide sequences allowing a
stable insertion in a chromosome of yeasts or one
or more DNA sequences which regulate DNA replication
in yeasts belonging to the genus Saccharomyces
or to the genus Kluyveromyces, and/or

(f) a DNA sequence encoding a signal polypeptide of not
more than 30 amino acid residues assisting the
translocation of proteins, which DNA sequence is
situated upstream of and in the same reading frame
as the structural gene.

4. DNA sequence according to claim 3, characterized
in that the structural gene of (b) encodes thaumatin
or chymosin, or their various allelic or modified
forms, which modified forms do not impair the
sweet-tasting properties of thaumatin or the milk-clotting
properties of chymosin, respectively, or precursors of
these thaumatin-like or chymosin-like proteins.

5. DNA sequence according to claim 3, characterized
in that the specific DNA sequence of (c) is a
DNA sequence essentially as given in Fig. 3.

6. DNA sequence according to claim 3, characterized
in that the DNA sequence of (f) encoding a signal
polypeptide is selected from the group consisting
of DNA sequences encoding

(a) the signal polypeptide translocating S. cerevisiae
invertase, namely
Met.Leu.Leu.Gln.Ala.Phe.Leu.Phe.Leu.Leu.Ala.Gly.Phe.Ala.Ala.Lys.Ile.Ser.Ala;

(b) the signal polypeptide translocating S. cerevisiae
acid phosphatase, namely
Met.Phe.Lys.Ser.Val.Val.Tyr.Ser.Ile.Leu.Ala.Ala.Ser.Leu.Ala.Asn.Ala;

(c) the signal polypeptide of unmatured forms of
thaumatin-like proteins, namely
Met.Ala.Ala.Thr.Thr.Cys.Phe.Phe.Phe.Leu.Phe.Pro.Phe.Leu.Leu.Leu.Leu.Thr.Leu.Ser.Arg.Ala;

(d) the signal polypeptide of unmatured forms of
chymosin-like proteins, namely
Met.Arg.Cys.Leu.Val.Val.Leu.Leu.Ala.Val.Phe.Ala.Leu.Ser.Gln.Gly; and

(e) two consensus signal polypeptides, namely
Met.Ser.Lys.Ala.Ala.Leu.Ala.Phe.Ile.Ala.Phe.Val.Ile.Val.Leu.Ile.Val.Asn.Ala and
Met.Ser.Lys.Phe.Val.Ile.Val.Leu.Ile.Val.Ala.Ala.Leu.Ala.Phe.Ile.Ala.Asn.Ala.



7. DNA sequence according to claim 3, characterized
in that either the codons of the structural
gene of (b) or the codons of the signal polypeptide
encoding DNA sequence of (f), or both, are modified
into codons preferred by yeasts.

8. DNA sequence according to claim 7, characterized
in that as codons preferred by yeasts the
following codons are used:

GCC or GCT    alanine
TTG           leucine
AGA           arginine
AAG           lysine
AAC           asparagine
ATG           methionine
GAC or GAT    aspartic acid
TTC           phenylalanine
TGT           cysteine
CCA           proline
CAA           glutamine
TCC or TCT    serine
GAA           glutamic acid
ACC or ACT    threonine
GGT           glycine
TGG           tryptophan
CAC           histidine
TAC           tyrosine
ATC or ATT    isoleucine
GTC or GTT    valine.



9. Process for preparing yeasts containing rDNA
sequences, characterized in that DNA sequences as
claimed in claim 3 are introduced into Saccharomyces,
Kluyveromyces, Debaryomyces, Hansenula, Candida,
Torulopsis or Rhodotorula yeasts, either in the form of
plasmids or by incorporation in the yeast genome.

10. Yeast containing an rDNA sequence as claimed
in claim 3, either in the form of a plasmid or
incorporated in the yeast genome.

11. Process for preparing a protein by cultivation
of a yeast containing rDNA sequences, characterized
in that a yeast as claimed in claim 10 is used
to produce a protein, whereby the rDNA sequence
contains a structural gene encoding that protein or a
precursor thereof which precursor during processing can
form the relevant protein.

12. Protein, produced by a process according to
claim 11.
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Fig.3. 
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Fig. 7a 
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LEADER SEQUENCE PREPROCHYMOSI N IN S. CER. PREFERRED CODONS 

ATGAGATGTT TGGTTGTTTT GTTGGCTGTT TTCGCTTTGT CCCAAGGT
-174 -165 -155 -145 -135 -127 



DNA SEQUENCE PRO PART PRECEDING PSEUDOCHYMOSIN IN S. CER. PREFERRED CODONS

GCTGAAATTA CTAGAATTCC ATTGTACAAG GGTAAGTCCT TGAGAAAGGC TTTCAAGGAA
-126 -116 -106 -96 -86 -76

CACGGTTTGT TGGAAGACTT C
-66 -56 -46



DNA SEQUENCE PSEUDO PART PRECEDING CHYMOSIN IN S. CER. PREFERRED CODONS

TTGCAAAAGC AACAATACGG TATTTCCTCC AAGTACTCCG GTTTC
-45 -36 -26 -16 -6 -1



DNA SEQUENCE CHYMOSIN IN S. CER. PREFERRED CODONS

        10         20         30         40         50         60
GGTGAAGTTG CTTCCGTTCC ATTGACTAAC TACTTGGACT CCCAATACTT CGGTAAGATT
        70         80         90        100        110        120
TACTTGGGTA CTCCACCACA AGAATTCACT GTTTTGTTCG ACACTGGTTC CTCCGACTTC
       130        140        150        160        170        180
TGGGTTCCAT CCATTTACTG TAAGTCCAAC GCTTGTAAGA ACCACCAAAG ATTCGACCCA
       190        200        210        220        230        240
AGAAAGTCCT CCACTTTCCA AAACTTGGGT AAGCCATTGT CCATTCACTA CGGTACTGGT
       250        260        270        280        290        300
TCCATGCAAG GTATTTTGGG TTACGACACT GTTACTGTTT CCAACATTGT TGACATTCAA
       310        320        330        340        350        360
CAAACTGTTG GTTTGTCCAC TCAAGAACCA GGTGACGTTT TCACTTACGC TGAATTCGAC
       370        380        390        400        410        420
CGTATTTTGG GTATGGCTTA CCCATCCTTG GCTTCCGAAT ACTCCATTCC AGTTTTCGAC
       430        440        450        460        470        480
AACATGATGA ACAGACACTT GGTTGCTCAA GACTTGTTCT CCGTTTACAT GGACAGAAAC
       490        500        510        520        530        540
GGTCAAGAAT CCATGTTGAC TTTGGGTGCT ATTGACCCAT CCTACTACAC TGGTTCCTTG
       550        560        570        580        590        600
CACTGGGTTC CAGTTACTGT TCAACAATAC TGGCAATTCA CTGTTGACTC CGTTACTATT
       610        620        630        640        650        660
TCCGGTGTTG TTGTTGCTTG TGAAGGTCGT TGTCAAGCTA TTTTGGACAC TGGTACTTCC
       670        680        690        700        710        720
AAGTTCGTTG CTCCATCCTC CGACATTTTG AACATTCAAC AAGCTATTGG TGCTACTCAA
       730        740        750        760        770        780
AACCAATACG GTCAATTCGA CATTGACTCT GACAACTTGT CCTACATGCC AACTGTTGTT
       790        800        810        820        830        840
TTCGAAATTA ACGGTAAGAT GTACCCATTG ACTCCATCCG CTTACACTTC CCAAGACCAA
       850        860        870        880        890        900
GGTTTCTGTA CTTCCGGTTT CCAATCCGAA AACCACTCCC AAAAGTGGAT TTTGGGTGAC
       910        920        930        940        950        960
GTTTTCATTA GAGAATACTA CTCCGTTTTC GACAGAGCTA ACAACTTGCT TGGTTTGGCT
      969
AAGGCTATT



LEADER SEQUENCE OF PREPROTHAUMATIN IN S. CER. PREFERRED CODONS

ATGGCTGCTA CTACTTGTTT CTTCTTCTTG TTCCCATTCT TGTTGTTGTT GACTTTGTCC
-66 -57 -47 -37 -27 -17 -7

AGAGCT
-1



DNA SEQUENCE MATURE THAUMATIN IN S. CER. PREFERRED CODONS

        10         20         30         40         50         60
GCTACTTTCG AAATTGTTAA CAGATGTTCC TACACTGTCT GGGCTGCTGC TTCCAAGGGT
        70         80         90        100        110        120
GACGCTGCTT TCGACGCTGC TGGTAGACAA TTGAACTCCG GTGAATCCTG CACTATTAAC
       130        140        150        160        170        180
GTTGAACCAG GTACTAAGGG TGCTAAGATT TGGGCTAGAA CTGACTGTTA CTTCGACGAC
       190        200        210        220        230        240
TCCGGTACAG GTATTTGTAG AACTGGTCAC TGTGCTGGTT TGTTGCAATG TAAGAGATTC
       250        260        270        280        290        300
GGTAGACCAC CAACTACTTT GGCTGAATTC TCCTTGAACC AATACGGTAA GGACTACATT
       310        320        330        340        350        360
GACATTTCCA ACATTAAGGG TTTCAACGTT CCAATGTACT TCTCCCCAAC TACTAGAGGT
       370        380        390        400        410        420
TGTAGAGGTG TTAGATGTGC TCCTGACATT GTTGGTCAAT GTCCAGCTAA CTTGAAGGCT
       430        440        450        460        470        480
CCAGGTGGTG GTTGTAACGA CCCTTGTACT GTTTTCCAAA CTTCCGAATA CTGTTGTACT
       490        500        510        520        530        540
ACTGGTAAGT GTGGTCCAAC TGAATACTCC AGATTCTTCA AGAGATTGTG TCCAGACGCT
       550        560        570        580        590        600
TTCTCCTACG TTTTGCACAA GCCAACTACT GTTACTTGTC CAGGTTCCTC CAACTACAGA
       610        620
GTTACTTTCT GTCCAACTGC T



DNA SEQUENCE ACIDIC PEPTIDE OF PROTHAUMATIN IN S. CER. PREFERRED CODONS

622        632
TTGGAATTCG AACACGAA



LEADER SEQUENCE ACID PHOSPHATASE IN YEAST PREFERRED CODONS

ATGTTCAAGT CCGTTGTTTA CTCCATTTTG GCTGCTTCCT TGGCTAACGC T
-51 -41 -31 -21 -11 -1



LEADER SEQUENCE INVERTASE IN YEAST PREFERRED CODONS

ATGTTGTTGC AAGCTTTCTT GTTCTTGTTG GCTGGTTTCG CTGCTAAGAT TTCCGCT
-57 -47 -37 -27 -17 -7



Fig. 23.
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CONSENSUS LEADER SEQUENCE (A-B) IN YEAST PREFERRED CODONS 

ATGTCCAAGG CTGCTTTGGC TTTCATTGCT TTCGTTATTG TTTTGATTGT TAACGCT 
-48 -38 -28 -18 -8 -1 



CONSENSUS LEADER SEQUENCE (B-A) IN YEAST PREFERRED CODONS 

ATGTCCAAGT TCGTTATTGT TTTGATTGTT GCTGCTTTGG CTTTCATTGC TAACGCT 
-48 -38 -28 -18 -8 -1 
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